In Python, how do you get the "last" directory in a path string?


I am working on a remote file system, where I don't have direct access to the files/directories, so I cannot check if a string represents a file or a directory.

I have the following paths I need to handle, where I have to get a hold of the "partition column":

path1 = "/path/to/2012-01-01/files/2014-01-31/la.parquet"
path2 = "/path/to/2012-01-01/files/2014-01-31/"
path3 = "/path/to/2012-01-01/files/2014-01-31"

In all cases, the deepest directory (the partition column) is "2014-01-31". Is there a consistent way to get it in a single line of code, or do I have to do all sorts of checks on file names?

I was hoping to do something like:

import os
os.path.dirname(path).split("/")[-1]

But this doesn't work for path3. Does one need to have access to the filesystem to correctly identify the deepest directory, or is there some easy way?

5

There are 5 answers

2
Yevhen Kuzmovych On BEST ANSWER

Technically, la.parquet is a valid directory name, so there's no way to tell just from the string. You'll need to introduce some manual logic, e.g. check for '.' in the name.

>>> import pathlib
>>> p = pathlib.Path(path1)
>>> p.parent.name if '.' in p.name else p.name
'2014-01-31'
>>> p = pathlib.Path(path2)
>>> p.parent.name if '.' in p.name else p.name
'2014-01-31'
>>> p = pathlib.Path(path3)
>>> p.parent.name if '.' in p.name else p.name
'2014-01-31'

You can be more precise (e.g. check for '.parquet' in p.name) if needed.
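As a rough sketch of the more precise check, the heuristic can be wrapped in a small helper. The function name `last_directory` and the `file_suffixes` parameter are made up for illustration, and the assumption is that you know in advance which extensions your files use. `PurePosixPath` only manipulates the string, so nothing touches the filesystem.

```python
from pathlib import PurePosixPath


def last_directory(path, file_suffixes=(".parquet",)):
    """Return the deepest directory name using only string logic.

    Hypothetical helper: file_suffixes is an assumption about which
    extensions mark the final component as a file.
    """
    p = PurePosixPath(path)
    # A trailing "/" is ignored by PurePosixPath, so p.name is the last component.
    if p.suffix in file_suffixes:
        return p.parent.name
    return p.name


for path in (
    "/path/to/2012-01-01/files/2014-01-31/la.parquet",
    "/path/to/2012-01-01/files/2014-01-31/",
    "/path/to/2012-01-01/files/2014-01-31",
):
    print(last_directory(path))
```

A directory whose name happens to end in one of the listed suffixes would still be misclassified, so this only narrows the ambiguity rather than removing it.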

2
onlinejudge95 On

This can be solved using pathlib. Here is a straightforward solution.

import pathlib


path1 = pathlib.Path("/path/to/2012-01-01/files/2014-01-31/la.parquet")
print(path1.parent.name if not path1.is_dir() else path1.name)

[EDIT]:- Handling cases when the last path can be a directory or a file


3
olizimmermann On

Probably something like this:

import os, pathlib
pathlib.Path(path).parent if not os.path.isdir(path) else pathlib.Path(path)


0
Jatinder Kumar On

You can use the following code to get the last directory name:

import os
path = "/path/to/your/directory"
last_directory = os.path.basename(os.path.dirname(path))
print(last_directory)

You have to make sure the directory name ends with a directory separator; otherwise the last component will be treated as a file.
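To make the trailing-separator caveat concrete, here is a small sketch applied to the three paths from the question. The function name `last_directory_by_separator` is made up for illustration, and `posixpath` is used instead of `os.path` on the assumption that the remote paths always use "/" regardless of the local platform.

```python
import posixpath


def last_directory_by_separator(path):
    # dirname() drops everything after the final "/", so the result
    # depends on whether the directory name is followed by a separator.
    return posixpath.basename(posixpath.dirname(path))


results = [
    last_directory_by_separator(p)
    for p in (
        "/path/to/2012-01-01/files/2014-01-31/la.parquet",
        "/path/to/2012-01-01/files/2014-01-31/",
        "/path/to/2012-01-01/files/2014-01-31",
    )
]
print(results)  # ['2014-01-31', '2014-01-31', 'files']
```

The third path has no trailing separator, so its last component is dropped as if it were a file name.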

0
Esben Eickhardt On

I expected there to be a function like "basename" to resolve this, but I guess as @Jatinder says, there is no way to differentiate between "directory name" and "filename without extension" without having access to the filesystem.

I ended up just using a regex for my specific use case, and I guess if example 3 were left out, there would be a general solution that checks for strings ending with "/".

Here is my solution:

import re

# Examples
path1 = "/path/to/2012-01-01/files/2014-01-31/la.parquet"
path2 = "/path/to/2012-01-01/files/2014-01-31/"
path3 = "/path/to/2012-01-01/files/2014-01-31"

# Find match
for path in [path1, path2, path3]:
    print(re.search(r'.*(\d{4}-\d{2}-\d{2})', path).group(1))
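
One possible hardening of the regex approach, sketched under the assumption that the partition directory always looks like YYYY-MM-DD: `re.search` returns `None` when nothing matches, so calling `.group(1)` directly would raise `AttributeError` on a path without a date. The helper name `last_date_component` is made up for illustration.

```python
import re

# The greedy ".*" pushes the match as far right as possible,
# so the captured group is the last date-like substring in the path.
_LAST_DATE = re.compile(r".*(\d{4}-\d{2}-\d{2})")


def last_date_component(path):
    match = _LAST_DATE.search(path)
    # Return None instead of raising when the path contains no date.
    return match.group(1) if match else None


print(last_date_component("/path/to/2012-01-01/files/2014-01-31/la.parquet"))
print(last_date_component("/path/to/files"))
```

Note that this matches a date anywhere in the string, including inside a file name, so it relies on file names not containing dates of their own.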