Mailing List Archive

Symlinks already present
Is there any way to check whether a directory is already symlinked, without
checking every symlink to see where it points? That is a bit time
consuming, because I have two or three directories that may get a new symlink,
but I have to check a list of 20,000-30,000 symlinks to delete it and avoid
duplicates...
--
https://mail.python.org/mailman/listinfo/python-list
Re: Symlinks already present
> On 26 Jul 2020, at 14:03, Termoregolato <waste@is.invalid> wrote:
>
> There is any way to check if a directory is already symlinked,

No. None.

> without controlling every symlink viewing the link? That is a bit time consuming, due I've two or three directory that can have a new symlink, but I've to check on a list of 20-30000 symlinks to delete it and avoid duplicates...

Don’t you have control of the code that is adding the symlinks?

Barry

Re: Symlinks already present
On 26/07/20 15:19, Barry wrote:

> No. None.

Sob :-) But thanks for confirming.

> Don’t you have control of the code that is adding the symlinks?

No, so I must traverse the directories where the symlinks are in order to
deduplicate them. There are some ways to minimize the work, but that
approach could be the simplest.
Re: Symlinks already present
On 27/07/2020 00:56, Termoregolato wrote:
> There is any way to check if  a directory is already symlinked, without
> controlling every symlink viewing the link? That is a bit time
> consuming, due I've two or three directory that can have a new symlink,
> but I've to check on a list of 20-30000 symlinks to delete it and avoid
> duplicates...


Please review "os — Miscellaneous operating system interfaces"
(https://docs.python.org/3/library/os.html)
- in particular os.stat() - and thus os.stat_result,
and given what you appear to be doing, possibly os.scandir() and
os.DirEntry.


I built a test:

dir1
-- dir-link = symlink to dir2
dir2
-- real-file

>>> os.stat( "dir2", follow_symlinks=True )
os.stat_result(st_mode=16893, st_ino=2345143, st_dev=64773, st_nlink=2,
st_uid=1000, st_gid=1000, st_size=4096, st_atime=1595793224,
st_mtime=1595793223, st_ctime=1595793223)
>>> os.stat( "dir1/dir-link", follow_symlinks=True )
os.stat_result(st_mode=16893, st_ino=2345143, st_dev=64773, st_nlink=2,
st_uid=1000, st_gid=1000, st_size=4096, st_atime=1595793224,
st_mtime=1595793223, st_ctime=1595793223)
>>> os.stat( "dir1/dir-link", follow_symlinks=False )
os.stat_result(st_mode=41471, st_ino=2345146, st_dev=64773, st_nlink=1,
st_uid=1000, st_gid=1000, st_size=7, st_atime=1595793558,
st_mtime=1595793558, st_ctime=1595793558)

NB st_size
Size of the file in bytes, if it is a regular file or a symbolic
link. The size of a symbolic link is the length of the pathname it
contains, without a terminating null byte.

Thus, compare the results of the two calls to detect a difference.
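
A minimal sketch of that comparison, assuming a POSIX system. The helper name is_symlink_by_stat is made up for illustration; in practice os.path.islink() answers the same question more directly.

```python
import os

def is_symlink_by_stat(path):
    """Return True if `path` itself is a symlink (hypothetical helper).

    Compares the stat of the link target (follow_symlinks=True) with the
    stat of the path itself (follow_symlinks=False). For a symlink the two
    results describe different inodes.
    """
    followed = os.stat(path, follow_symlinks=True)    # raises OSError if the link is broken
    unfollowed = os.stat(path, follow_symlinks=False)
    return (followed.st_dev, followed.st_ino) != (unfollowed.st_dev, unfollowed.st_ino)
```

Comparing (st_dev, st_ino) rather than st_size avoids depending on the length of the stored path.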
--
Regards =dn
Re: Symlinks already present
On 26/07/20 20:39, Dennis Lee Bieber wrote:

> Since symbolic links are essentially just short files containing the
> path to the eventual target file/directory, with an OS flag that the file
> is a link

Yes, I use them heavily to give a lot of directories a kind of
order, depending on their contents. It's simple to see whether a link is
broken, but not whether two links are duplicates.

--
Pastrano
with another account
Re: Symlinks already present
On 26/07/20 22:47, dn wrote:

> Thus, compare the results of the two calls to detect a difference.

I will also try another way. If I'm not mistaken, symlinks and the original
directory have the same inode number (I'm talking about Linux, where I'm
using the application). I have a lot of directories like this

abcd efgh .ab dc de

where the last part can change depending on the contents. They are symlinked in a
tree under a different directory, divided into many other directories, like

work/a/abcd efgh .ab dc de

where there are generally 5-50 links. So I could, if that is correct, walk the
directory and, keeping a small array of the inode numbers, check whether
these numbers are duplicated.

--
Pastrano
with another account
Re: Symlinks already present
On Tue, Jul 28, 2020 at 4:26 AM Termoregolato <waste@is.invalid> wrote:
>
> On 26/07/20 20:39, Dennis Lee Bieber wrote:
>
> > Since symbolic links are essentially just short files containing the
> > path to the eventual target file/directory, with an OS flag that the file
> > is a link
>
> Yes, I use them massively to give to a lot of directories a kind of
> order, depending on their contents. It's simple to see if link is
> broken, but not if they're duplicate
>

Ah, I think I get what you're doing.

Do you need an efficient way to see if a single target directory has
multiple symlinks pointing to it, or are you trying to audit the
entire collection all at once? I don't think there's a neat way to do
the former, but the latter isn't too hard. Try something like this:

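import os

# Enumerate the target directories (the real/physical ones)
target_root = "...."
dirs = {os.path.join(target_root, name): None
        for name in os.listdir(target_root)}

# Iterate over the symlinks and see where they point
link_root = "...."
for name in os.listdir(link_root):
    link = os.path.join(link_root, name)
    # Assumes each link stores the same path spelling used for the keys above
    dest = os.readlink(link)
    if dirs.get(dest):
        print("DUPLICATE")
        print(dirs[dest], link)
    dirs[dest] = link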

You can then also check if any are missing, by seeing if there are any
Nones left in the dirs dict.

Unfortunately there's no real way to shortcut this if you just want to
check one target directory. You'd still have to readlink() every
symlink to try to find them.

ChrisA
Re: Symlinks already present
On 2020-07-27 at 20:20:08 +0200,
Termoregolato <waste@is.invalid> wrote:

> On 26/07/20 20:39, Dennis Lee Bieber wrote:
>
> > Since symbolic links are essentially just short files containing the
> > path to the eventual target file/directory, with an OS flag that the file
> > is a link
>
> Yes, I use them massively to give to a lot of directories a kind of order,
> depending on their contents. It's simple to see if link is broken, but not
> if they're duplicate

If you know where the symlinks can be, then find and collect them into a
dictionary whose keys are the *targets* and whose values are a list of
the symlinks that point to that target. Then it's easy to spot the
targets that have more than one symlink.
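
A rough sketch of that dictionary in Python, assuming all the symlinks live directly inside one directory. The function names group_symlinks_by_target and duplicated_targets are invented for this example.

```python
import os
from collections import defaultdict

def group_symlinks_by_target(link_dir):
    """Map each resolved target path to the list of symlinks pointing at it."""
    by_target = defaultdict(list)
    with os.scandir(link_dir) as entries:
        for entry in entries:
            if entry.is_symlink():
                # realpath() resolves relative links and symlink chains so that
                # different spellings of the same target end up under one key.
                by_target[os.path.realpath(entry.path)].append(entry.path)
    return by_target

def duplicated_targets(by_target):
    """Keep only the targets that have more than one symlink."""
    return {target: links for target, links in by_target.items() if len(links) > 1}
```

Calling duplicated_targets(group_symlinks_by_target(some_dir)) then gives, for each over-linked target, the links that could be candidates for deletion.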

--
“Whoever undertakes to set himself up as a
judge of Truth and Knowledge is shipwrecked
by the laughter of the gods.” – Albert Einstein
Dan Sommers, http://www.tombstonezero.net/dan
Re: Symlinks already present
On 2020-07-27, Termoregolato <waste@is.invalid> wrote:
> On 26/07/20 22:47, dn wrote:
>
>> Thus, compare the results of the two calls to detect a difference.
>
> I will try also another way, If I don't err symlinks and original
> directory have the same inode number (I talk about Linux, where I'm
> using the application).

You err. Symlinks are distinct i-nodes which are not the same i-node
as the destination. A symlink is basically a file containing a string
that is read and then used as a path to another file.

If you create a "hard" link (ln without the '-s') then you end up with a single
i-node that has entries in multiple directories.
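
The difference can be seen from Python with a small experiment. This is only a sketch assuming a POSIX filesystem that supports both link types; the file names and the function compare_link_inodes are arbitrary.

```python
import os
import tempfile

def compare_link_inodes():
    """Create a file, a hard link and a symlink; return their own inode numbers."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original")
        hard = os.path.join(tmp, "hard")
        soft = os.path.join(tmp, "soft")
        with open(original, "w") as f:
            f.write("data")
        os.link(original, hard)      # like `ln original hard`
        os.symlink(original, soft)   # like `ln -s original soft`
        # lstat() never follows symlinks, so it reports each name's own inode.
        return (os.lstat(original).st_ino,
                os.lstat(hard).st_ino,
                os.lstat(soft).st_ino)

orig_ino, hard_ino, soft_ino = compare_link_inodes()
print(orig_ino == hard_ino)   # the hard link shares the original's i-node
print(orig_ino == soft_ino)   # the symlink has an i-node of its own
```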

[.old-Unix-guy story: Way back when, SunOS used to allow you (if root)
to create a hard link to a directory. It's not something you did a
second time.]





Re: Symlinks already present
On 28/07/20 00:19, Grant Edwards wrote:

> You err.

I read it, and I had to test. In fact, it was simple to test.

me@debsrv:~/tmp/test$ ln -s /home/me/mydir aaa
me@debsrv:~/tmp/test$ ln -s /home/me/mydir bbb

me@debsrv:~/tmp/test$ ls
aaa bbb

me@debsrv:~/tmp/test$ stat --format=%i /home/me/mydir
18481153
me@debsrv:~/tmp/test$ stat --format=%i aaa
2364513
me@debsrv:~/tmp/test$ stat --format=%i bbb
2374065

Thanks

--
Pastrano
with another account
Re: Symlinks already present
On 28/07/20 02:50, Dennis Lee Bieber wrote:

> inode numbers apply for HARD LINKS

Thanks

--
Pastrano
with another account
Re: Symlinks already present
On 27/07/20 20:37, Chris Angelico wrote:

> Unfortunately there's no real way to shortcut this if you just want to
> check one target directory. You'd still have to readlink() every
> symlink to try to find them.

Sorry for the 10-day delay (hardware problems at home). Yes, that's right.
It's a way to order directories by their content, but because the first
characters are always the same, I get a tree like

finaldir/f/firstpart/secondpart [changing_values]

and the test only has to be done on a small list of links, so it's fast.

--
Pastrano
with another account
Re: Symlinks already present
On 2020-08-08 at 01:58:13 +0200,
Termoregolato <waste@is.invalid> wrote:

> me@debsrv:~/tmp/test$ stat --format=%i /home/me/mydir
> 18481153

Try ls -i. :-)
Re: Symlinks already present
On 27Jul2020 20:20, Termoregolato <waste@is.invalid> wrote:
>On 26/07/20 20:39, Dennis Lee Bieber wrote:
>>Since symbolic links are essentially just short files containing the
>>path to the eventual target file/directory, with an OS flag that the file
>>is a link
>
>Yes, I use them massively to give to a lot of directories a kind of
>order, depending on their contents. It's simple to see if link is
>broken, but not if they're duplicate

Hmm. If you're scanning them all, you can at least cache the (dev,ino)
of the link target. So broken is stat-failed. Duplicate is
seen-this-(dev,ino)-before. You only need the stat, not to (for example)
resolve the path the symlink becomes.
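
One possible shape for that scan, as a sketch only: the function name audit_symlinks and the choice to report (first link, later link) pairs are assumptions made for the example.

```python
import os

def audit_symlinks(link_paths):
    """Classify symlinks as broken or duplicate using one os.stat() per link.

    Returns (broken, duplicates): `broken` is a list of links whose target
    cannot be stat'ed; `duplicates` is a list of (first_link, later_link)
    pairs that resolve to the same (st_dev, st_ino).
    """
    seen = {}          # (st_dev, st_ino) -> first link seen for that target
    broken = []
    duplicates = []
    for link in link_paths:
        try:
            st = os.stat(link)      # follows the symlink by default
        except OSError:
            broken.append(link)     # dangling link, loop, or permission problem
            continue
        key = (st.st_dev, st.st_ino)
        if key in seen:
            duplicates.append((seen[key], link))
        else:
            seen[key] = link
    return broken, duplicates
```

Because the key is (st_dev, st_ino), two links count as duplicates even when they spell the target path differently, for example one relative and one absolute.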

You've probably thought of this already of course.

Cheers,
Cameron Simpson <cs@cskk.id.au>
Re: Symlinks already present
On 27Jul2020 22:19, Grant Edwards <grant.b.edwards@gmail.com> wrote:
>On 2020-07-27, Termoregolato <waste@is.invalid> wrote:
>> On 26/07/20 22:47, dn wrote:
>>> Thus, compare the results of the two calls to detect a difference.
>>
>> I will try also another way, If I don't err symlinks and original
>> directory have the same inode number (I talk about Linux, where I'm
>> using the application).
>
>You err. Symlinks are distinct i-nodes which are not the same i-node
>as the destination. A symlink is basically a file containing a string
>that is read and then used a path to another file.

We need to be careful with terminology (just for clarity).

Each "source" symlink has its own inode. But if you os.stat() the
symlink it follows the symlink and you get the inode for the "target"
directory - two symlinks which point at the same directory will return the same
inode and thus (st_dev,st_ino) in that stat result.

That can be used for comparison, and you don't need to readlink or
anything like that - let the OS do it all for you during the os.stat()
call.
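
For a pairwise check, the standard library's os.path.samefile() performs this same device-and-inode comparison. The sketch below, with an arbitrary temporary layout and a made-up helper links_share_target, shows the manual comparison next to the library call.

```python
import os
import tempfile

def links_share_target(link_a, link_b):
    """True if both paths resolve to the same (st_dev, st_ino)."""
    a, b = os.stat(link_a), os.stat(link_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "mydir")
    os.mkdir(target)
    aaa, bbb = os.path.join(tmp, "aaa"), os.path.join(tmp, "bbb")
    os.symlink(target, aaa)
    os.symlink(target, bbb)
    manual = links_share_target(aaa, bbb)
    builtin = os.path.samefile(aaa, bbb)   # same check, done by the standard library
    # The links themselves are still separate inodes.
    own_inodes_differ = os.lstat(aaa).st_ino != os.lstat(bbb).st_ino

print(manual, builtin, own_inodes_differ)
```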

>If you create a "hard" link (ln without the '-s') then you end up a single
>i-node that has entries in multiple directories.

Aye.

>[.old-Unix-guy story: Way back when, SunOS used to allow you (if root)
>to create a hard link to a directory. It's not something you did a
>second time.]

It's a well defined operation. There are some policy choices an OS can
make about some of the side effects (how does pwd work? how you got
there? or some underlying "real" path - this spills over into "what does
".." mean?), etc. But having made those choices, the idea is just fine.

As a counter example, many rsync based backup systems have the following
underlying approach:

- make a new directory tree with every file hardlinked from the previous
backup tree

- rsync into the new tree, because rsync unlinks and replaces changed
files
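
The first of those two steps can be sketched in Python with shutil.copytree() and os.link as the copy function. This only illustrates the hardlinked-tree idea under POSIX assumptions, not how any particular backup tool is implemented; hardlink_snapshot is a made-up name.

```python
import os
import shutil

def hardlink_snapshot(previous, new):
    """Recreate the directory tree of `previous` at `new`, hard-linking every file."""
    # copytree() makes fresh directories and calls copy_function(src, dst) per file;
    # passing os.link turns each "copy" into a hard link instead of a data copy.
    shutil.copytree(previous, new, copy_function=os.link)
```

The directories in the new tree are real, separate directories; only the files share inodes with the previous tree, which is why replacing a changed file in the new tree leaves the old backup untouched.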

By contrast, MacOS Time Machine utilises hardlinking directories on
HFS volumes: instead of making a new directory tree full of hardlinks
you just hardlink the top directory itself if nothing inside it has been
changed.

Cheers,
Cameron Simpson <cs@cskk.id.au>
Re: Symlinks already present
On Mon, Aug 31, 2020 at 1:17 PM Cameron Simpson <cs@cskk.id.au> wrote:
> Each "source" symlink has its own inode. But if you os.stat() the
> symlink it follows the symlink and you get the inode for the "target"
> directory - two symlinks which point at the same directory will return the same
> inode and thus (st_dev,st_ino) in that stat result.
>
> That can be used for comparison, and you don't need to readlink or
> anything like that - let the OS do it all for you during the os.stat()
> call.

Note that this is only the case if os.stat is called with
follow_symlinks=True, which is the default, but isn't the only way to
do things. And if you get stat results while you're iterating over a
directory, you don't follow symlinks.
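
One place this distinction shows up in Python is os.scandir(), chosen here as an illustration and not necessarily the case meant above. DirEntry.inode() reports the inode of the directory entry itself, while DirEntry.stat() follows symlinks unless told otherwise. The helper entry_inodes and the directory layout are invented for the sketch.

```python
import os
import tempfile

def entry_inodes(directory):
    """For each entry return (name, own inode, inode after following symlinks)."""
    rows = []
    with os.scandir(directory) as entries:
        for entry in entries:
            own = entry.inode()              # inode of the entry itself
            followed = entry.stat().st_ino   # DirEntry.stat() follows symlinks by default
            rows.append((entry.name, own, followed))
    return sorted(rows)

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "target")
    links = os.path.join(tmp, "links")
    os.mkdir(target)
    os.mkdir(links)
    os.symlink(target, os.path.join(links, "aaa"))
    rows = entry_inodes(links)
    target_inode = os.stat(target).st_ino

name, own, followed = rows[0]
print(name, own != followed, followed == target_inode)
```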

> >[.old-Unix-guy story: Way back when, SunOS used to allow you (if root)
> >to create a hard link to a directory. It's not something you did a
> >second time.]
>
> It's a well defined operation. There are some policy choices an OS can
> make about some of the side effects (how does pwd work? how you got
> there? or some underlying "real" path - this spills over into "what does
> ".." mean?), etc. But having made those choices, the idea is just fine.

Is it well defined? Because of the ".." issue, it's not going to be as
symmetric as hardlinking files is. You can move a file by hardlinking
it and then unlinking the original name. If you do that with a
directory, at what point do you update its parent pointer? What
happens if you create TWO more hardlinks, and then unlink the original
name? Can you even *have* a single concept of a "real path" without it
basically just being symlinks in disguise?

BTW, the pwd issue actually isn't an issue, since it really *will* be
"how you got there". You can see that with modern systems if you have
symlinks in the path, or rename a directory:

rosuav@sikorsky:~/tmp$ mkdir -p a/b/c/d/e
rosuav@sikorsky:~/tmp$ cd a/b/c/d/e
rosuav@sikorsky:~/tmp/a/b/c/d/e$ mv ~/tmp/a/{b,q}
rosuav@sikorsky:~/tmp/a/b/c/d/e$ pwd
/home/rosuav/tmp/a/b/c/d/e
rosuav@sikorsky:~/tmp/a/b/c/d/e$ cd `pwd`
bash: cd: /home/rosuav/tmp/a/b/c/d/e: No such file or directory
rosuav@sikorsky:~/tmp/a/b/c/d/e$ ls -al
total 8
drwxr-xr-x 2 rosuav rosuav 4096 Aug 31 14:17 .
drwxr-xr-x 3 rosuav rosuav 4096 Aug 31 14:17 ..
rosuav@sikorsky:~/tmp/a/b/c/d/e$ cd ..
rosuav@sikorsky:~/tmp/a/q/c/d$ pwd
/home/rosuav/tmp/a/q/c/d
rosuav@sikorsky:~/tmp/a/q/c/d$

As soon as I try to go to the parent, it has to figure out what the
real path to that parent is. Otherwise, it's just the path that I
typed to get there - even though that might no longer be correct.
(There have been times, for instance, when I'm in a "dead" directory
and have to cd `pwd` to get back to the "real" directory with the same
name.)

The parent directory is crucially important here.

ChrisA
Re: Symlinks already present
On 31Aug2020 14:20, Chris Angelico <rosuav@gmail.com> wrote:
>On Mon, Aug 31, 2020 at 1:17 PM Cameron Simpson <cs@cskk.id.au> wrote:
>> Each "source" symlink has its own inode. But if you os.stat() the
>> symlink it follows the symlink and you get the inode for the "target"
>> directory - two symlinks which point at the same directory will return the same
>> inode and thus (st_dev,st_ino) in that stat result.
>>
>> That can be used for comparison, and you don't need to readlink or
>> anything like that - let the OS do it all for you during the os.stat()
>> call.
>
>Note that this is only the case if os.stat is called with
>follow_symlinks=True, which is the default, but isn't the only way to
>do things.

Maybe not, but it is the way I'm suggesting.

>> >[.old-Unix-guy story: Way back when, SunOS used to allow you (if
>> >root)
>> >to create a hard link to a directory. It's not something you did a
>> >second time.]
>>
>> It's a well defined operation. There are some policy choices an OS can
>> make about some of the side effects (how does pwd work? how you got
>> there? or some underlying "real" path - this spills over into "what does
>> ".." mean?), etc. But having made those choices, the idea is just fine.
>
>Is it well defined?

It can be well defined. Probably should have phrased it that way.

>Because of the ".." issue, it's not going to be as
>symmetric as hardlinking files is. You can move a file by hardlinking
>it and then unlinking the original name. If you do that with a
>directory, at what point do you update its parent pointer? What
>happens if you create TWO more hardlinks, and then unlink the original
>name? Can you even *have* a single concept of a "real path" without it
>basically just being symlinks in disguise?

Shrug. Who says ".." is wired to the directory, and not the user's
process context? Who says a wired to the directory ".." needs changing
at any time except when its referring link count goes to 1? There are
many choices here. Making those choices is a policy decision for the OS
implementor, and they all have their costs and benefits.

>BTW, the pwd issue actually isn't an issue, since it really *will* be
>"how you got there". You can see that with modern systems if you have
>symlinks in the path, or rename a directory: [...snip...]

Yeah, makes me ill. That's because these days "pwd" is usually a shell
builtin with funny semantics and a cache/sanity-check against $PWD
(which gets computed as you cd around, typically). And it has a -P
option and friends explicitly because of this hideous stuff.
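
The two views can be compared from Python, where os.getcwd() asks the operating system for the physical directory instead of consulting $PWD. A small sketch, assuming a POSIX system; the function name and directory names are arbitrary.

```python
import os
import tempfile

def physical_vs_logical():
    """Enter a directory through a symlink and report both views of 'where am I'."""
    start = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        real_dir = os.path.join(tmp, "real")
        link = os.path.join(tmp, "link")
        os.mkdir(real_dir)
        os.symlink(real_dir, link)
        try:
            os.chdir(link)
            logical = link            # what a shell would keep in $PWD after `cd link`
            physical = os.getcwd()    # asks the OS, like `pwd -P`
            resolved = os.path.realpath(link)
        finally:
            os.chdir(start)           # leave the temp dir before it is deleted
    return logical, physical, resolved
```

Here the logical path ends in "link" while the physical one ends in "real", which is the same mismatch a shell's cached $PWD can exhibit.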

Cheers,
Cameron Simpson <cs@cskk.id.au>
Re: Symlinks already present
On Mon, Aug 31, 2020 at 5:28 PM Cameron Simpson <cs@cskk.id.au> wrote:
> >Because of the ".." issue, it's not going to be as
> >symmetric as hardlinking files is. You can move a file by hardlinking
> >it and then unlinking the original name. If you do that with a
> >directory, at what point do you update its parent pointer? What
> >happens if you create TWO more hardlinks, and then unlink the original
> >name? Can you even *have* a single concept of a "real path" without it
> >basically just being symlinks in disguise?
>
> Shrug. Who says ".." is wired to the directory, and not the user's
> process context? Who says a wired to the directory ".." needs changing
> at any time except when its referring link count goes to 1? There are
> many choices here. Making those choices is a policy decision for the OS
> implementor, and they all have their costs and benefits.
>

Consider the situation I posed: start with one reference to the
directory, add two more, then remove the original. Where is its
parent? Is there any good way to handle that? And if you allow
hardlinking of directories at all, there's no reason to block this
particular sequence of operations. A naive reading of your description
is that the parent, in this situation, would remain unchanged - which
means the parent is some completely unrelated directory. Or, worse, it
could end up with a parent of itself, or a parent of its own child.

Are you SURE it can be well-defined?

ChrisA
Re: Symlinks already present
On 8/31/20 3:35 AM, Chris Angelico wrote:
> On Mon, Aug 31, 2020 at 5:28 PM Cameron Simpson <cs@cskk.id.au> wrote:
>>> Because of the ".." issue, it's not going to be as
>>> symmetric as hardlinking files is. You can move a file by hardlinking
>>> it and then unlinking the original name. If you do that with a
>>> directory, at what point do you update its parent pointer? What
>>> happens if you create TWO more hardlinks, and then unlink the original
>>> name? Can you even *have* a single concept of a "real path" without it
>>> basically just being symlinks in disguise?
>> Shrug. Who says ".." is wired to the directory, and not the user's
>> process context? Who says a wired to the directory ".." needs changing
>> at any time except when its referring link count goes to 1? There are
>> many choices here. Making those choices is a policy decision for the OS
>> implementor, and they all have their costs and benefits.
>>
> Consider the situation I posed: start with one reference to the
> directory, add two more, then remove the original. Where is its
> parent? Is there any good way to handle that? And if you allow
> hardlinking of directories at all, there's no reason to block this
> particular sequence of operations. A naive reading of your description
> is that the parent, in this situation, would remain unchanged - which
> means the parent is some completely unrelated directory. Or, worse, it
> could end up with a parent of itself, or a parent of its own child.
>
> Are you SURE it can be well-defined?
>
> ChrisA

EVERY reference to the .. file link has to have a full path to that
link, either explicit with the reference or implicit via the current
working directory. That can define what is the parent. Yes, that says
that two references to the 'same' directory (same as in same inode, but
different paths) will find a different value for .. in it. So the
definition of .. can be well defined, even in the presence of multiple
parent directories.

--
Richard Damon

Re: Symlinks already present
On Mon, Aug 31, 2020 at 9:57 PM Richard Damon <Richard@damon-family.org> wrote:
>
> On 8/31/20 3:35 AM, Chris Angelico wrote:
> > On Mon, Aug 31, 2020 at 5:28 PM Cameron Simpson <cs@cskk.id.au> wrote:
> >>> Because of the ".." issue, it's not going to be as
> >>> symmetric as hardlinking files is. You can move a file by hardlinking
> >>> it and then unlinking the original name. If you do that with a
> >>> directory, at what point do you update its parent pointer? What
> >>> happens if you create TWO more hardlinks, and then unlink the original
> >>> name? Can you even *have* a single concept of a "real path" without it
> >>> basically just being symlinks in disguise?
> >> Shrug. Who says ".." is wired to the directory, and not the user's
> >> process context? Who says a wired to the directory ".." needs changing
> >> at any time except when its referring link count goes to 1? There are
> >> many choices here. Making those choices is a policy decision for the OS
> >> implementor, and they all have their costs and benefits.
> >>
> > Consider the situation I posed: start with one reference to the
> > directory, add two more, then remove the original. Where is its
> > parent? Is there any good way to handle that? And if you allow
> > hardlinking of directories at all, there's no reason to block this
> > particular sequence of operations. A naive reading of your description
> > is that the parent, in this situation, would remain unchanged - which
> > means the parent is some completely unrelated directory. Or, worse, it
> > could end up with a parent of itself, or a parent of its own child.
> >
> > Are you SURE it can be well-defined?
> >
> > ChrisA
>
> EVERY reference to the .. file link has to have a full path to that
> link, either explicit with the reference or implicit via the current
> working directory. That can define what is the parent. Yes, that says
> that two references to the 'same' directory (same as in same inode, but
> different paths) will find a different value for .. in it. So the
> definition of .. can be well defined, even in the presence of multiple
> parent directories.
>

That's incompatible with the normal meaning of "..", and it also
implies that any time you rename any directory, you have to scan all
of its children (recursively) to find any parent directory references
that need to change. I'm still not sure how this solves the problem -
it just pushes it to everything else, and you still have to have ".."
mean multiple things somehow.

ChrisA
Re: Symlinks already present
On 8/31/20 9:00 AM, Chris Angelico wrote:
> On Mon, Aug 31, 2020 at 9:57 PM Richard Damon <Richard@damon-family.org> wrote:
>> On 8/31/20 3:35 AM, Chris Angelico wrote:
>>> On Mon, Aug 31, 2020 at 5:28 PM Cameron Simpson <cs@cskk.id.au> wrote:
>>>>> Because of the ".." issue, it's not going to be as
>>>>> symmetric as hardlinking files is. You can move a file by hardlinking
>>>>> it and then unlinking the original name. If you do that with a
>>>>> directory, at what point do you update its parent pointer? What
>>>>> happens if you create TWO more hardlinks, and then unlink the original
>>>>> name? Can you even *have* a single concept of a "real path" without it
>>>>> basically just being symlinks in disguise?
>>>> Shrug. Who says ".." is wired to the directory, and not the user's
>>>> process context? Who says a wired to the directory ".." needs changing
>>>> at any time except when its referring link count goes to 1? There are
>>>> many choices here. Making those choices is a policy decision for the OS
>>>> implementor, and they all have their costs and benefits.
>>>>
>>> Consider the situation I posed: start with one reference to the
>>> directory, add two more, then remove the original. Where is its
>>> parent? Is there any good way to handle that? And if you allow
>>> hardlinking of directories at all, there's no reason to block this
>>> particular sequence of operations. A naive reading of your description
>>> is that the parent, in this situation, would remain unchanged - which
>>> means the parent is some completely unrelated directory. Or, worse, it
>>> could end up with a parent of itself, or a parent of its own child.
>>>
>>> Are you SURE it can be well-defined?
>>>
>>> ChrisA
>> EVERY reference to the .. file link has to have a full path to that
>> link, either explicit with the reference or implicit via the current
>> working directory. That can define what is the parent. Yes, that says
>> that two references to the 'same' directory (same as in same inode, but
>> different paths) will find a different value for .. in it. So the
>> definition of .. can be well defined, even in the presence of multiple
>> parent directories.
>>
> That's incompatible with the normal meaning of "..", and it also
> implies that any time you rename any directory, you have to scan all
> of its children (recursively) to find any parent directory references
> that need to change. I'm still not sure how this solves the problem -
> it just pushes it to everything else, and you still have to have ".."
> mean multiple things somehow.
>
> ChrisA

The . and .. entries in a directory don't need to be 'real' entries
added to the directory using up directory slots in the directory, but
pseudo entries created by the file system when reading a directory. To
read a directory, you need to specify it (how else do you say you want
to read it), and the meaning of . and .. can be derived from the path
used to read the directory.

And yes, this means that a given directory, reachable by multiple paths,
may give different values for .. (or .) based on which path you came to
it from.
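
Current systems generally do not allow hard links to directories, so the following is only an analogy using a symlink. It contrasts a ".." derived purely from the path text (os.path.normpath) with a ".." resolved on disk (os.path.realpath); the layout and the function name are invented for the example.

```python
import os
import tempfile

def two_meanings_of_dotdot():
    """Return the parent's name as seen lexically and as seen on disk."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = os.path.realpath(tmp)
        os.makedirs(os.path.join(tmp, "x", "real"))
        os.mkdir(os.path.join(tmp, "y"))
        link = os.path.join(tmp, "y", "link")
        os.symlink(os.path.join(tmp, "x", "real"), link)
        via_link = os.path.join(link, "..")
        lexical = os.path.normpath(via_link)    # drops "link/.." textually, giving .../y
        physical = os.path.realpath(via_link)   # follows the link first, giving .../x
        return os.path.basename(lexical), os.path.basename(physical)

print(two_meanings_of_dotdot())
```

The same directory is reached either way, but the parent depends on whether ".." is taken from the path used or from the directory's on-disk entry.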

--
Richard Damon

Re: Symlinks already present
On Tue, Sep 1, 2020 at 2:40 AM Richard Damon <Richard@damon-family.org> wrote:
>
> On 8/31/20 9:00 AM, Chris Angelico wrote:
> > That's incompatible with the normal meaning of "..", and it also
> > implies that any time you rename any directory, you have to scan all
> > of its children (recursively) to find any parent directory references
> > that need to change. I'm still not sure how this solves the problem -
> > it just pushes it to everything else, and you still have to have ".."
> > mean multiple things somehow.
> >
> > ChrisA
>
> The . and .. entries in a directory don't need to be 'real' entries
> added to the directory using up directory slots in the directory, but
> pseudo entries created by the file system when reading a directory. To
> read a directory, you need to specify it (how else do you say you want
> to read it), and the meaning of . and .. can be derived from the path
> used to read the directory.

You can open a directory (same as you open a file), and then you have
an open file descriptor. You can open something relative to something
else. And you can chroot in between those two operations, which would
mean that there is no complete path that references what you are
opening.
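
The descriptor-relative part of this can be sketched in Python with the dir_fd parameter of os.open(), which is available on Linux and most other POSIX platforms. The example uses a rename in place of chroot, since chroot needs root privileges; names are arbitrary.

```python
import os
import tempfile

def read_via_dir_fd():
    """Open a directory, rename it, then open a file relative to the descriptor."""
    with tempfile.TemporaryDirectory() as tmp:
        old_path = os.path.join(tmp, "b")
        os.mkdir(old_path)
        with open(os.path.join(old_path, "note.txt"), "w") as f:
            f.write("hello")
        dir_fd = os.open(old_path, os.O_RDONLY)   # descriptor for the directory itself
        try:
            # After the rename, the path used to open the directory no longer exists.
            os.rename(old_path, os.path.join(tmp, "q"))
            # The lookup is relative to the open directory, not to any path string.
            file_fd = os.open("note.txt", os.O_RDONLY, dir_fd=dir_fd)
            with os.fdopen(file_fd) as f:
                content = f.read()
            old_path_exists = os.path.exists(old_path)
        finally:
            os.close(dir_fd)
    return content, old_path_exists

print(read_via_dir_fd())
```

The file is still readable even though no valid path string for the directory was used at the time of the second open.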

> And yes, this means that a given directory, reachable by multiple paths,
> may give different values for .. (or .) based on which path you came to
> it from.

That would basically violate the concept of hardlinks, which is that
they have the same content regardless of how you access them. What
you're suggesting is far better handled by symlinks.

ChrisA
Re: Symlinks already present
> On 31 Aug 2020, at 17:38, Richard Damon <Richard@Damon-Family.org> wrote:
>
> On 8/31/20 9:00 AM, Chris Angelico wrote:
>> On Mon, Aug 31, 2020 at 9:57 PM Richard Damon <Richard@damon-family.org> wrote:
>>> On 8/31/20 3:35 AM, Chris Angelico wrote:
>>>> On Mon, Aug 31, 2020 at 5:28 PM Cameron Simpson <cs@cskk.id.au> wrote:
>>>>>> Because of the ".." issue, it's not going to be as
>>>>>> symmetric as hardlinking files is. You can move a file by hardlinking
>>>>>> it and then unlinking the original name. If you do that with a
>>>>>> directory, at what point do you update its parent pointer? What
>>>>>> happens if you create TWO more hardlinks, and then unlink the original
>>>>>> name? Can you even *have* a single concept of a "real path" without it
>>>>>> basically just being symlinks in disguise?
>>>>> Shrug. Who says ".." is wired to the directory, and not the user's
>>>>> process context? Who says a wired to the directory ".." needs changing
>>>>> at any time except when its referring link count goes to 1? There are
>>>>> many choices here. Making those choices is a policy decision for the OS
>>>>> implementor, and they all have their costs and benefits.
>>>>>
>>>> Consider the situation I posed: start with one reference to the
>>>> directory, add two more, then remove the original. Where is its
>>>> parent? Is there any good way to handle that? And if you allow
>>>> hardlinking of directories at all, there's no reason to block this
>>>> particular sequence of operations. A naive reading of your description
>>>> is that the parent, in this situation, would remain unchanged - which
>>>> means the parent is some completely unrelated directory. Or, worse, it
>>>> could end up with a parent of itself, or a parent of its own child.
>>>>
>>>> Are you SURE it can be well-defined?
>>>>
>>>> ChrisA
>>> EVERY reference to the .. file link has to have a full path to that
>>> link, either explicit with the reference or implicit via the current
>>> working directory. That can define what is the parent. Yes, that says
>>> that two references to the 'same' directory (same as in same inode, but
>>> different paths) will find a different value for .. in it. So the
>>> definition of .. can be well defined, even in the presence of multiple
>>> parent directories.
>>>
>> That's incompatible with the normal meaning of "..", and it also
>> implies that any time you rename any directory, you have to scan all
>> of its children (recursively) to find any parent directory references
>> that need to change. I'm still not sure how this solves the problem -
>> it just pushes it to everything else, and you still have to have ".."
>> mean multiple things somehow.
>>
>> ChrisA
>
> The . and .. entries in a directory don't need to be 'real' entries
> added to the directory using up directory slots in the directory, but
> pseudo entries created by the file system when reading a directory. To
> read a directory, you need to specify it (how else do you say you want
> to read it), and the meaning of . and .. can be derived from the path
> used to read the directory.
>
> And yes, this means that a given directory, reachable by multiple paths,
> may give different values for .. (or .) based on which path you came to
> it from.

I'm intrigued.

How are you adding a second path that shows this mutating ".." ?
I tried with a symlink and that did not change the ".." inode.
Do you mean that I can do this with a bind mount?

Barry
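
The symlink experiment described above can be reproduced with a short
sketch. It assumes a POSIX system where symlinks can be created; the
directory names "a", "b", "target" and "link" and the helper
compare_parent_lookups() are made up for illustration. It compares the
".." that the kernel resolves through a symlink with the ".." obtained by
collapsing the path string lexically, which is roughly what a
path-derived ".." would look like.

```python
import os
import tempfile


def compare_parent_lookups():
    """Compare kernel and lexical resolution of '..' through a symlink."""
    with tempfile.TemporaryDirectory() as tmp:
        real_parent = os.path.join(tmp, "a")   # hypothetical layout
        link_parent = os.path.join(tmp, "b")
        real_dir = os.path.join(real_parent, "target")
        os.makedirs(real_dir)
        os.mkdir(link_parent)

        # b/link -> a/target
        link = os.path.join(link_parent, "link")
        os.symlink(real_dir, link)

        def ident(path):
            st = os.stat(path)
            return (st.st_dev, st.st_ino)

        # os.stat() hands "b/link/.." to the kernel unchanged, so the
        # symlink is followed first and ".." is looked up in a/target.
        kernel_parent = ident(os.path.join(link, ".."))

        # os.path.normpath() collapses "link/.." as text, never touching
        # the filesystem, so it ends up at "b".
        lexical_parent = ident(os.path.normpath(os.path.join(link, "..")))

        return {
            "kernel_sees_real_parent": kernel_parent == ident(real_parent),
            "lexical_sees_link_parent": lexical_parent == ident(link_parent),
        }


if __name__ == "__main__":
    print(compare_parent_lookups())
```

Both values should come out True: the kernel's ".." is the parent of the
real directory, while only the purely textual reading depends on the path
that was used.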


>
> --
> Richard Damon
>
Re: Symlinks already present [ In reply to ]
On 8/31/20 1:07 PM, Barry Scott wrote:
>
> I'm intrigued.
>
> How are you adding a second path that shows this mutating ".." ?
> I tried with a symlink and that did not change the ".." inode.
> Do you mean that I can do this with a bind mount?
>
> Barry
>
This is based on a hypothetical OS that allows creating hard links to
directories, just as it does for files. Current *nix systems don't work
this way: they disallow hard links to directories precisely because
allowing them causes this sort of issue.

--
Richard Damon
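
A minimal sketch of that restriction, assuming a POSIX system. The names
"some_dir" and "alias_dir" and the helper try_directory_hardlink() are
hypothetical. On Linux I would expect the call to fail with a
PermissionError (EPERM); other systems may report a different OSError, so
the sketch only returns whatever was raised.

```python
import os
import tempfile


def try_directory_hardlink():
    """Attempt os.link() on a directory; return the OSError raised, or None."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "some_dir")   # hypothetical names
        alias = os.path.join(tmp, "alias_dir")
        os.mkdir(source)
        try:
            # Same call that works for regular files.
            os.link(source, alias)
        except OSError as exc:
            return exc
        return None


if __name__ == "__main__":
    print(repr(try_directory_hardlink()))
```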

Re: Symlinks already present [ In reply to ]
On 8/31/20 12:49 PM, Chris Angelico wrote:
> On Tue, Sep 1, 2020 at 2:40 AM Richard Damon <Richard@damon-family.org> wrote:
>> On 8/31/20 9:00 AM, Chris Angelico wrote:
>>> That's incompatible with the normal meaning of "..", and it also
>>> implies that any time you rename any directory, you have to scan all
>>> of its children (recursively) to find any parent directory references
>>> that need to change. I'm still not sure how this solves the problem -
>>> it just pushes it to everything else, and you still have to have ".."
>>> mean multiple things somehow.
>>>
>>> ChrisA
>> The . and .. entries in a directory don't need to be 'real' entries
>> added to the directory using up directory slots in the directory, but
>> pseudo entries created by the file system when reading a directory. To
>> read a directory, you need to specify it (how else do you say you want
>> to read it), and the meaning of . and .. can be derived from the path
>> used to read the directory.
> You can open a directory (same as you open a file), and then you have
> an open file descriptor. You can open something relative to something
> else. And you can chroot in between those two operations, which would
> mean that there is no complete path that references what you are
> opening.
The file descriptor could remember the path used to get to it. chroot
shows that .. needs to be somewhat special, as it needs to go away for
any process whose current root is that directory.
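
The point about opening relative to a directory descriptor can be tried
out with a sketch like the following. It assumes a platform where
os.stat() accepts dir_fd and os.O_DIRECTORY exists (Linux, for example);
"old_name", "new_name" and parent_via_dir_fd() are made-up names. The
directory is renamed after it has been opened, so the path originally
used to reach it no longer exists, yet ".." relative to the descriptor
still resolves.

```python
import os
import tempfile


def parent_via_dir_fd():
    """Resolve '..' relative to an open directory fd after a rename."""
    with tempfile.TemporaryDirectory() as tmp:
        old = os.path.join(tmp, "old_name")      # hypothetical names
        os.mkdir(old)
        fd = os.open(old, os.O_RDONLY | os.O_DIRECTORY)
        try:
            # The path that was used to open the directory goes away.
            os.rename(old, os.path.join(tmp, "new_name"))

            # Look up ".." starting from the descriptor, not from a path.
            parent = os.stat("..", dir_fd=fd)
            expected = os.stat(tmp)
            return (parent.st_dev, parent.st_ino) == (
                expected.st_dev,
                expected.st_ino,
            )
        finally:
            os.close(fd)


if __name__ == "__main__":
    print(parent_via_dir_fd())
```

On current systems this works without the descriptor remembering any path;
a path-derived ".." would need that extra bookkeeping.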
>
>> And yes, this means that a given directory, reachable by multiple paths,
>> may give different values for .. (or .) based on which path you came to
>> it from.
> That would basically violate the concept of hardlinks, which is that
> they have the same content regardless of how you access them. What
> you're suggesting is far better handled by symlinks.
>
> ChrisA

I see no problem with it being a hardlink; in fact, executables know
the name they were executed by, so directories knowing the path isn't
that different. The key difference between a hardlink and a symlink is
that hardlinks maintain existence and always point to something that
exists (things know how many hardlinks refer to them). Symlinks don't
reference the actual file object but the symbolic path to it, which may
or may not actually exist, and the target doesn't know that such a link
exists.

--
Richard Damon
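
The hardlink/symlink distinction for regular files can be illustrated
with a small sketch, assuming a POSIX filesystem that supports both link
types. The file names and the helper compare_links() are hypothetical.

```python
import os
import tempfile


def compare_links():
    """Show how hard links and symlinks react when the original name goes."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original.txt")   # hypothetical names
        hard = os.path.join(tmp, "hard.txt")
        soft = os.path.join(tmp, "soft.txt")

        with open(original, "w") as f:
            f.write("payload")

        os.link(original, hard)      # second name for the same inode
        os.symlink(original, soft)   # stores only the path string

        # The symlink is not counted in the inode's link count.
        nlink_before = os.stat(original).st_nlink

        os.unlink(original)

        with open(hard) as f:
            hard_content = f.read()

        return {
            "nlink_before": nlink_before,
            "nlink_after": os.stat(hard).st_nlink,
            "hard_content": hard_content,
            # The symlink itself is still there...
            "symlink_present": os.path.islink(soft),
            # ...but following it fails, because its target path is gone.
            "symlink_target_exists": os.path.exists(soft),
        }


if __name__ == "__main__":
    print(compare_links())
```

The link count drops from 2 to 1 and the data stays reachable through the
hard link, while the symlink is left dangling.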
