| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
| |
In the abstract interface of `JoinablePath`, replace `__str__()` with
`__vfspath__()`. This frees user implementations of `JoinablePath` to
implement `__str__()` however they like (or not at all.)
Also add `pathlib._os.vfspath()`, which calls `__fspath__()` or
`__vfspath__()`.
|
|
|
|
|
|
|
| |
(#133051)
Ensure that warnings about unspecified text encodings are emitted from
`ReadablePath.read_text()`, `WritablePath.write_text()` and `magic_open()`
with the correct stack level set.
|
|
|
|
|
|
|
|
| |
Add optional *add_scheme* argument to `urllib.request.pathname2url()`; when
set to true, a complete URL is returned. Likewise add optional
*require_scheme* argument to `url2pathname()`; when set to true, a complete
URL is accepted.
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
|
|
|
|
|
|
|
|
|
|
|
| |
In `urllib.request.url2pathname()`, if the authority resolves to the
current host, discard it. If an authority is present but resolves somewhere
else, then on Windows we return a UNC path (as before), and on other
platforms we raise `URLError`.
Affects `pathlib.Path.from_uri()` in the same way.
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
|
|
|
|
|
|
|
| |
In `JoinablePath.full_match()` and `ReadablePath.glob()`, accept a `str`
pattern argument rather than `JoinablePath | str`.
In `ReadablePath.copy()` and `copy_into()`, accept a `WritablePath` target
argument rather than `WritablePath | str`.
|
|
|
|
|
| |
When `pathlib._os.magic_open()` is called to open a path in binary mode,
raise `ValueError` if any of the *encoding*, *errors* or *newline*
arguments are given. This matches the `open()` built-in.
|
|
|
|
|
| |
For compatibility with `Path.glob()`, raise `ValueError` if an empty
pattern is given to `ReadablePath.glob()`.
|
|
|
|
|
|
| |
Call `urllib.request.pathname2url()` from `pathlib.Path.as_uri()`, and
deprecate the duplicate implementation in `PurePath`.
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
|
|
|
|
|
|
| |
Call `urllib.request.url2pathname()` from `pathlib.Path.from_uri()` rather
than re-implementing it. This paves the way for solving the main issue
(ignoring local authorities and rejecting non-local ones) in urllib, not
pathlib.
|
|
|
|
|
|
|
|
|
|
|
| |
In the private pathlib ABCs, replace `_WritablePath._write_info()` with
`_WritablePath._copy_from()`. This provides the target path object with
more control over the copying process, including support for querying and
setting metadata *before* the path is created.
Adjust `_ReadablePath.copy()` so that it forwards its keyword arguments to
`_WritablePath._copy_from()` of the target path object. This allows us to
remove the unimplemented *preserve_metadata* argument in the ABC method,
making it a `Path` exclusive.
|
|
|
|
| |
Test copying from `Path` and `ReadableZipPath` (types of `_ReadablePath`)
to `Path` and `WritableZipPath` (types of `_WritablePath`).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove the *case_sensitive* argument from `_JoinablePath.full_match()` and
`_ReadablePath.glob()`. Using a non-native case sensitivity forces the use
of "case-pedantic" globbing, where we `iterdir()` even for non-wildcard
pattern segments. But it's hard to know when to enable this mode, as
case-sensitivity can vary by directory, so `_PathParser.normcase()` doesn't
always give the full picture. The `Path.glob()` implementation is forced to
make an educated guess, but we can avoid the issue in the ABCs by dropping
the *case_sensitive* argument.
(I probably shouldn't have added these arguments in `PurePath` and `Path`
in the first place!)
Also drop support for `_ReadablePath.glob(recurse_symlinks=False)`, which
makes recursive globbing much slower.
|
|
|
|
|
|
| |
In `pathlib.types._JoinablePath.full_match()`, treat alternate path
separators in the path and pattern as if they were primary separators. e.g.
if the parser is `ntpath`, then `P(r'foo/bar\baz').full_match(r'*\*/*')` is
true.
|
|
|
|
|
| |
In `pathlib.types._JoinablePath.with_name()`, retain any alternative path
separator preceding the old name, rather stripping and replacing it with a
primary separator. As a result, this method changes _only_ the name.
|
|
|
|
|
| |
The `pathlib` module used to import stuff from both `_abc` and `_local`,
but nowadays the `_local` module provides the entire public pathlib
implementation, so there's no reason for the indirection.
|
|
|
|
|
|
|
|
|
| |
There used to be a meaningful distinction between these modules: `pathlib`
imported `pathlib._abc` but not `pathlib.types`. This is no longer the
case (neither module is imported), so we move the ABCs as follows:
- `pathlib._abc.JoinablePath` --> `pathlib.types._JoinablePath`
- `pathlib._abc.ReadablePath` --> `pathlib.types._ReadablePath`
- `pathlib._abc.WritablePath` --> `pathlib.types._WritablePath`
|
|
|
|
|
|
|
|
| |
Remove the *mode*, *parents* and *exist_ok* arguments from
`WritablePath.mkdir()`. These arguments imply support for POSIX permissions
and checking for preexistence of the path or its parents, but subclasses of
`WritablePath` may not have these capabilities.
The public `Path.mkdir()` method retains these arguments.
|
|
|
|
|
|
| |
Remove `ReadablePath` methods duplicated by `ReadablePath.info`. To be
specific, we remove `exists()`, `is_dir()`, `is_file()` and `is_symlink()`.
The public `Path` class retains these methods.
|
|
|
|
|
|
|
|
| |
(#116392)" (#130743)
This broke tests on the 'aarch64 Fedora Stable Clang Installed 3.x' and
'AMD64 Fedora Stable Clang Installed 3.x' build bots.
This reverts commit da4899b94a9a9083fed4972b2473546e0d997727.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
## Filtered recursive walk
Expanding a recursive `**` segment entails walking the entire directory
tree, and so any subsequent pattern segments (except special segments) can
be evaluated by filtering the expanded paths through a regex. For example,
`glob.glob("foo/**/*.py", recursive=True)` recursively walks `foo/` with
`os.scandir()`, and then filters paths through a regex based on "`**/*.py`,
with no further filesystem access needed.
This fixes an issue where `glob()` could return duplicate results.
## Tracking path existence
We store a flag alongside each path indicating whether the path is
guaranteed to exist. As we process the pattern:
- Certain special pattern segments (`""`, `"."` and `".."`) leave the flag
unchanged
- Literal pattern segments (e.g. `foo/bar`) set the flag to false
- Wildcard pattern segments (e.g. `*/*.py`) set the flag to true (because
children are found via `os.scandir()`)
- Recursive pattern segments (e.g. `**`) leave the flag unchanged for the
root path, and set it to true for descendants discovered via
`os.scandir()`.
If the flag is false at the end, we call `lstat()` on each path to filter
out missing paths.
## Minor speed-ups
- Exclude paths that don't match a non-terminal non-recursive wildcard
pattern _prior_ to calling `is_dir()`.
- Use a stack rather than recursion to implement recursive wildcards.
- This fixes a recursion error when globbing deep trees.
- Pre-compile regular expressions and pre-join literal pattern segments.
- Convert to/from `bytes` (a minor use-case) in `iglob()` rather than
supporting `bytes` throughout. This particularly simplifies the code
needed to handle relative bytes paths with `dir_fd`.
- Avoid calling `os.path.join()`; instead we keep paths in a normalized
form and append trailing slashes when needed.
- Avoid calling `os.path.normcase()`; instead we use case-insensitive regex
matching.
## Implementation notes
Much of this functionality is already present in pathlib's implementation
of globbing. The specific additions we make are:
1. Support for `dir_fd`
2. Support for `include_hidden`
3. Support for generating paths relative to `root_dir`
This unifies the implementations of globbing in the `glob` and `pathlib`
modules.
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
|
|
|
| |
This feature isn't sufficiently motivated.
|
|
|
|
|
|
|
|
| |
Replace `WritablePath._copy_writer` with a new `_write_info()` method. This
method allows the target of a `copy()` to preserve metadata.
Replace `pathlib._os.CopyWriter` and `LocalCopyWriter` classes with new
`copy_file()` and `copy_info()` functions. The `copy_file()` function uses
`source_path.info` wherever possible to save on `stat()`s.
|
|
|
|
|
| |
In `pathlib.Path.copy()` and `move()`, return a fresh `Path` object with an
unpopulated `info` attribute, rather than a `Path` object with information
recorded *prior* to the path's creation.
|
|
|
|
|
|
|
| |
(#130422)
Call `ReadablePath.info.exists()` rather than `ReadablePath.exists()` when
globbing so that we use (or populate) the `info` cache.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the following methods, skip casting of the argument to a path object if
the argument has a `with_segments` attribute. In `PurePath`:
`relative_to()`, `is_relative_to()`, `match()`, and `full_match()`. In
`Path`: `rename()`, `replace()`, `copy()`, `copy_into()`, `move()`, and
`move_into()`.
Previously the check varied a bit from method to method. The `PurePath`
methods used `isinstance(arg, PurePath)`; the `rename()` and `replace()`
methods always cast, and the remaining `Path` methods checked for a private
`_copy_writer` attribute.
We apply identical changes to relevant methods of the private ABCs. This
improves performance a bit, because `isinstance()` checks on ABCs are
expensive.
|
|
|
|
| |
Remove `ReadablePath.rglob()` from the private pathlib ABCs. This method is
a trivial wrapper around `glob()` and easily replaced.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add the following private methods to `pathlib.Path.info`:
- `_posix_permissions()`: the POSIX file permissions (`S_IMODE(st_mode)`)
- `_file_id()`: the file ID (`(st_dev, st_ino)`)
- `_access_time_ns()`: the access time in nanoseconds (`st_atime_ns`)
- `_mod_time_ns()`: the modify time in nanoseconds (`st_mtime_ns`)
- `_bsd_flags()`: the BSD file flags (`st_flags`)
- `_xattrs()`: the file extended attributes as a list of key, value pairs,
or an empty list if `listxattr()` or `getxattr()` fail in an ignorable
way.
These methods replace `LocalCopyReader.read_metadata()`, and so we can
delete the `CopyReader` and `LocalCopyReader` classes. Rather than reading
metadata via `source._copy_reader.read_metadata()`, we instead call
`source.info._posix_permissions()`, `_access_time_ns()`, etc.
Preserving metadata is only supported for local-to-local copies at the
moment. To support copying metadata between arbitrary `ReadablePath` and
`WritablePath` objects, we'd need to make the new methods public and
documented.
Co-authored-by: Petr Viktorin <encukou@gmail.com>
|
|
|
|
|
|
|
|
|
|
| |
Remove the caching `_is_case_sensitive()` function.
The cache used to speed up `PurePath.[full_]match()` and `Path.[r]glob()`,
but that's no longer the case - these methods use
`self.parser is posixpath` to determine case sensitivity.
This makes the `pathlib._abc` module a little easier to backport to Python
3.8, where `functools.cache()` is unavailable.
|
|
|
|
|
|
|
|
|
|
|
| |
Convert `JoinablePath`, `ReadablePath` and `WritablePath` to real ABCs
derived from `abc.ABC`.
Make `JoinablePath.parser` abstract, rather than defaulting to `posixpath`.
Register `PurePath` and `Path` as virtual subclasses of the ABCs rather
than deriving. This avoids a hit to path object instantiation performance.
No change of behaviour in the public (non-abstract) classes.
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#129856)
Move pathlib's private `CopyReader`, `LocalCopyReader`, `CopyWriter` and
`LocalCopyWriter` classes into `pathlib._os`, where they can live alongside
the low-level copying functions (`copyfileobj()` etc) and high-level path
querying interface (`PathInfo`).
This sets the stage for merging `LocalCopyReader` into `PathInfo`.
No change of behaviour; just moving some code around.
|
|
|
|
|
|
|
|
|
|
| |
In the private pathlib ABCs, make `ReadablePath.glob('')` yield a path with
a trailing slash (if it yields anything at all). As a result, `glob()`
works similarly to `joinpath()` when given a non-magic pattern.
In the globbing implementation, we preemptively add trailing slashes to
intermediate paths if there are pattern parts remaining; this removes the
need to check for existing trailing slashes (in the removed `add_slash()`
method) at subsequent steps.
|
|
|
|
|
| |
Add `pathlib.Path.info` attribute, which stores an object implementing the `pathlib.types.PathInfo` protocol (also new). The object supports querying the file type and internally caching `os.stat()` results. Path objects generated by `Path.iterdir()` are initialised with status information from `os.DirEntry` objects, which is gleaned from scanning the parent directory.
The `PathInfo` protocol has four methods: `exists()`, `is_dir()`, `is_file()` and `is_symlink()`.
|
|
|
|
|
|
|
| |
Unlike `ReadablePath.[r]glob()` and `JoinablePath.full_match()`, the
`JoinablePath.match()` method doesn't support the recursive wildcard `**`,
and matches from the right when a fully relative pattern is given. These
quirks means its probably unsuitable for inclusion in the pathlib ABCs,
especially given `full_match()` handles the same use case.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#129014)
In the private pathlib ABCs, support write-only virtual filesystems by
making `WritablePath` inherit directly from `JoinablePath`, rather than
subclassing `ReadablePath`.
There are two complications:
- `ReadablePath.open()` applies to both reading and writing
- `ReadablePath.copy` is secretly an object that supports the *read* side
of copying, whereas `WritablePath.copy` is a different kind of object
supporting the *write* side
We untangle these as follow:
- A new `pathlib._abc.magic_open()` function replaces the `open()` method,
which is dropped from the ABCs but remains in `pathlib.Path`. The
function works like `io.open()`, but additionally accepts objects with
`__open_rb__()` or `__open_wb__()` methods as appropriate for the mode.
These new dunders are made abstract methods of `ReadablePath` and
`WritablePath` respectively. If the pathlib ABCs are made public, we
could consider blessing an "openable" protocol and supporting it in
`io.open()`, removing the need for `pathlib._abc.magic_open()`.
- `ReadablePath.copy` becomes a true method, whereas `WritablePath.copy` is
deleted. A new `ReadablePath._copy_reader` property provides a
`CopyReader` object, and similarly `WritablePath._copy_writer` is a
`CopyWriter` object. Once GH-125413 is resolved, we'll be able to move
the `CopyReader` functionality into `ReadablePath.info` and eliminate
`ReadablePath._copy_reader`.
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the private pathlib ABCs, rename `PurePathBase` to `JoinablePath`, and
split `PathBase` into `ReadablePath` and `WritablePath`. This improves the
API fit for read-only virtual filesystems.
The split of `PathBase` entails a similar split of `CopyWorker` (implements
copying) and the test cases in `test_pathlib_abc`.
In a later patch, we'll make `WritablePath` inherit directly from
`JoinablePath` rather than `ReadablePath`. For a couple of reasons,
this isn't quite possible yet.
|
|
|
|
|
| |
These methods combine `_delete()` and `copy()`, but `_delete()` isn't part
of the public interface, and it's unlikely to be added until the pathlib
ABCs are made official, or perhaps even later.
|
|
|
|
|
|
|
|
| |
Remove `PurePathBase.relative_to()` and `is_relative_to()` because they
don't account for *other* being an entirely different kind of path, and
they can't use `__eq__()` because it's not on the `PurePathBase` interface.
Remove `PurePathBase.drive`, `root`, `is_absolute()` and `as_posix()`.
These are all too specific to local filesystems.
|
|
|
|
|
|
|
| |
Remove the `PathBase.stat()` method. Its use of the `os.stat_result` API,
with its 10 mandatory fields and low-level types, makes it an awkward fit
for virtual filesystems.
We'll look to add a `PathBase.info` attribute later - see GH-125413.
|
|
|
|
|
|
|
|
|
|
|
| |
(#127810)
Move 9 private `PathBase` attributes and methods into a new `CopyWorker`
class. Change `PathBase.copy` from a method to a `CopyWorker` instance.
The methods remain private in the `CopyWorker` class. In future we might
make some/all of them public so that user subclasses of `PathBase` can
customize the copying process (in particular reading/writing of metadata,)
but we'd need to make `PathBase` public first.
|
|
|
|
|
| |
From `PurePathBase` delete `_globber`, `_stack` and `_pattern_str`, and
from `PathBase` delete `_glob_selector`. This helps avoid an unpleasant
surprise for a users who try to use these names.
|
|
|
|
|
| |
Remove the `PurePathBase` initializer, and make `with_segments()` and
`__str__()` abstract. This allows us to drop the `_raw_paths` attribute,
and also the `Parser.join()` protocol method.
|
|
|
|
|
| |
This method helped us customise the `UnsupportedOperation` message
depending on the type. But we're aiming to make `PathBase` a proper ABC
soon, so `NotImplementedError` is the right exception to raise there.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove the following methods from `pathlib._abc.PathBase`:
- `expanduser()`
- `hardlink_to()`
- `touch()`
- `chmod()`
- `lchmod()`
- `owner()`
- `group()`
- `from_uri()`
- `as_uri()`
These operations aren't regularly supported in virtual filesystems, so they
don't win a place in the `PathBase` interface. (Some of them probably don't
deserve a place in `Path` :P.) They're quasi-abstract (except `lchmod()`),
and they're not called by other `PathBase` methods.
|
|
|
|
|
|
|
|
|
|
| |
(#127709)
Remove `PathBase.samefile()`, which is fairly specific to the local FS, and
relies on `stat()`, which we're aiming to remove from `PathBase`.
Also remove `PathBase.is_mount()`, `is_junction()`, `is_block_device()`,
`is_char_device()`, `is_fifo()` and `is_socket()`. These rely on POSIX
file type numbers that we're aiming to remove from the `PathBase` API.
|
|
|
|
|
|
|
|
|
|
| |
Change the default value of `PurePathBase.parser` from `ParserBase()` to
`posixpath`. As a result, user subclasses of `PurePathBase` and `PathBase`
use POSIX path syntax by default, which is very often desirable.
Move `pathlib._abc.ParserBase` to `pathlib._types.Parser`, and convert it
to a runtime-checkable protocol.
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
|
|
|
|
|
|
|
| |
Virtual filesystems don't always make a distinction between deleting files
and empty directories, and sometimes support deleting non-empty directories
in a single operation. Here we remove `PathBase.unlink()` and `rmdir()`,
leaving `_delete()` as the sole deletion method, now made abstract. I hope
to drop the underscore prefix later on.
|
|
|
|
|
|
|
|
|
| |
Remove our implementation of POSIX path resolution in `PathBase.resolve()`.
This functionality is rather fragile and isn't necessary in most cases. It
depends on `PathBase.stat()`, which we're looking to remove.
Also remove `PathBase.absolute()`. Many legitimate virtual filesystems lack
the notion of a 'current directory', so it's wrong to include in the basic
interface.
|
|
|
|
| |
These methods are obviated by `PathBase.move()`, which can move directories
and supports any `PathBase` object as a target.
|
|
|
|
|
|
|
|
|
|
| |
Remove documentation for `pathlib.Path.scandir()`, and rename the method to
`_scandir()`. In the private pathlib ABCs, make `iterdir()` abstract and
call it from `_scandir()`.
It's not worthwhile to add this method at the moment - see discussion:
https://discuss.python.org/t/ergonomics-of-new-pathlib-path-scandir/71721
Co-authored-by: Steve Dower <steve.dower@microsoft.com>
|
|
|
|
|
| |
These classmethods presume that the user has retained the original
`__init__()` signature, which may not be the case. Also, many virtual
filesystems don't provide current or home directories.
|