Friday, November 9, 2007


Implementing realpath() in C++



Before reading this post, please read the background in the previous post PATH_MAX simply isn't, as it covers important background, and has some code we'll be building on over here.

Here's what realpath() is supposed to do:

realpath() expands all symbolic links and resolves references to ’/./’, ’/../’ and extra ’/’ characters in the null terminated string named by path and stores the canonicalized absolute pathname in the buffer of size PATH_MAX named by resolved_path. The resulting path will have no symbolic link, ’/./’ or ’/../’ components.


Instead of attacking implementing such a thing head on, we'll take a bottom up approach. One should consider that being able to break down a path between the path components prior to the last slash and after the last slash is essential to figuring out which directory the file actually lies.

The built in functions for this are called dirname() and basename(), but they modify their arguments, and they do some funny handling in special cases which we don't want.

So to implement our own for our purposes, we can write a function as follows:

void relative_dir_base_split(const std::string& path, std::string& dir, std::string& base)
{
std::string::size_type slash_pos = path.rfind("/"); //Find the last slash
if (slash_pos != std::string::npos) //If there is a slash
{
slash_pos++;
dir = path.substr(0, slash_pos); //Directory is before slash
base = path.substr(slash_pos); //And obviously, the file is after
}
else //Otherwise, there is no directory present
{
dir.clear();
base = path;
}
}

The function is simple enough, pass it a path, and two C++ strings for the return values, and have it split them up.

The next function which we'll use is one to get the realpath() of a directory. We can easily get such a thing by chdir()'ing to the directory, running our getcwd(), and then chdir() back.
This simple function is as follows:


bool chdir_getcwd(const std::string& dir, std::string& path)
{
bool success = false;
int start_fd = open(".", O_RDONLY); //Open current directory so we can save a handle to it
if (start_fd != -1)
{
if (!chdir(dir.c_str())) //Change to directory
{
success = getcwd(path); //And get its path
fchdir(start_fd); //And change back of course
}
close(start_fd);
}
return(success);
}

This simple function to resolve where a directory is located uses should be obvious, however, it's also a core function in figuring our where a file is, since a file is located at the realpath() of its parent directory, plus the base name for the file itself.

Now to get the realpath() of a file, we have this simple wrapper over the above functions:

static inline bool realpath_file(const std::string& path, std::string& resolved_path)
{
bool success = false;
std::string dir;
std::string base;
relative_dir_base_split(path, dir, base);

//If there is a directory, get the realpath() for it, otherwise the current directory
if (dir.size() ? chdir_getcwd(dir, resolved_path) : getcwd(resolved_path))
{
resolved_path += base;
success = true;
}
return(success);
}

Now we have chdir_getcwd() which is basically a realpath() for a directory, and realpath_file() for files. Note however, that if you call realpath_file() directly, it won't actually ensure that the file exists, so probably a bad idea to call directly, unless you don't care if it exists or not. It also won't handle symlinks. To handle symlinks we need quite a bit more code.

First, let us write a wrapper in C++ for reading links:

bool readlink_internal(const std::string& path, std::string& buffer, ssize_t length)
{
bool success = false;
if (length > 0)
{
char *buf = new(nothrow) char[length+1]; //Room for Null
if (buf)
{
ssize_t amount = ::readlink(path.c_str(), buf, length+1); //Give room for failure
if ((amount > 0) && (amount <= length)) //If > length, it was modified mid check
{
buf[amount] = 0;
buffer = buf;
success = true;
}
delete[] buf;
}
}
return(success);
}

This function simply wraps up the built in one, and uses C++ strings for dynamic allocation, so we don't have to worry about it directly in our function to resolve symlinks.

Before diving into the actual details in resolving a symlink, let me present one more helper function:

void build_path_base_swap(std::string &path, const std::string& newbase)
{
string dir;
string base;
relative_dir_base_split(path, dir, base);

if (dir.size())
{
path = dir + newbase;
}
else
{
path = newbase;
}
}


This above function will take a path, and replace the base part of it, meaning if you pass it "path/cows.txt" and "feet.txt", path will become "path/feet.txt".

Now onto the symlink discussion.

Resolving symlinks in itself is hard though, since one symlink could lead to another, meaning we'll need a loop. We also need to defend ourselves against an infinite loop if one symlink links to itself, or a symlink earlier on in our resolve loop.

It is not safe enough to call a regular stat() to check for the final link existing (and not a loop), since during our resolve, an attacker can modify a link, so we have to detect that.
Unfortunately, most OSs/libraries have a define called MAXSYMLINKS which I've seen set anywhere from 8 to 30. They stop their loop once they've iterated MAXSYMLINKS times and return an errno of ELOOP. Unfortunately, doing such has two flaws, which I hope should be apparent. First of all, if a link links to itself immediately, it would take MAXSYMLINKS passes until the function finally realizes it (I'm looking at you OpenBSD). The other issue is that it's just wrong.
Hopefully any of us who has taken a data structures course would know that there are algorithms for detecting a loop in a linked list, and not have to resort to any dumb method which doesn't actually detect a loop. Also a number such as 8 is way too low which I've seen some do. I look at one of my Linux boxes, and I see some binaries are actually symlinked a dozen times. Like "upx" is symlinked to "upx-3.00", which is symlinked to "upx-3.00-nonfree", which is symlinked to, etc...

The optimal way to detect a loop in a linked list is explained here, see the last paragraph for the optimal O(N) solution. Unfortunately though, the optimal solution means we need to create a second iterator to check on symlinks, resolving links ahead of the main loop, two at a time. While great in memory, it can be a pain on a hard drive, where we have to deal with slow access, and the links can change during the check. Therefor I settled on a variation of the first solution, where we'll keep track of the file IDs (described in the previous post) of each symlink, and see if we ever hit a match. We'll store this data in an std::set, which is usually implemented as a Red & Black Tree which has O(log(N)) time for insertion and searching. Making our algorithm work at O(log(N))*2 (since we insert and find each time) * N, which reduces to O(N log(N)).

Here's the code:

bool symlink_resolve(const std::string& start, std::string& end)
{
typedef std::pair<dev_t, ino_t> file_id;

bool success = false;
if (start.size())
{
std::string path = start; //Need a modifyable copy
struct stat sb;
std::set<file_id> seen_links;

bool resolved_link;
do //The symlink resolve loop
{
resolved_link = false;
if (!lstat(path.c_str(), &sb))
{
file_id current_id(sb.st_dev, sb.st_ino);
if (seen_links.find(current_id) == seen_links.end()) //Not a link we've seen
{
seen_links.insert(current_id); //Add to our set

if (S_ISLNK(sb.st_mode)) //Another link
{
std::string newpath;
if (readlink_internal(path, newpath, sb.st_size))
{
if (newpath[0] == '/') //Absolute
{
path = newpath;
}
else //We need to calculate the relative path in relation to the current
{
build_path_base_swap(path, newpath);
}
resolved_link = true;
} //Else, Link can't be read, time to quit
}
else //Yay, it's not a link! got to the last part finally!
{
success = realpath_file(path, end);
}
} //Else, Nice try, someone linked a link back into a previous link during the scan to try to trick us into an infinite loop
} //Else, Dangling link, can't resolve
} while (resolved_link);
}
return(success);
}


And now with all those helper functions out of the way, we can finally write the actual realpath() function itself, and it's a heck of a lot simpler than other implementations you'll find out there.

Here it is:

bool realpath(const std::string& path, std::string& resolved_path, bool resolve_link = true)
{
bool success = false;
if (path.size())
{
struct stat sb;
if (!stat(path.c_str(), &sb))
{
bool (*rp)(const std::string&, std::string&) = resolve_link ? symlink_resolve : realpath_file;
success = S_ISDIR(sb.st_mode) ? chdir_getcwd(path, resolved_path) : rp(path, resolved_path);
}
}
return(success);
}

It simply stat()s then runs chdir_getcwd() on directories and symlink_resolve() on files. I also offer an added feature. If you don't want the file name itself resolved, which is quite often what people want when a program saves a new file based off a name of another files passed to it, you can pass false as the third parameter, and it'll call realpath_file() instead of symlink_resolve(). With this wrapper, realpath_file() also has a stat() call before it, so the file is checked for existence like the built in realpath().

This implementation protects against the issues of PATH_MAX as described in our previous posts. It's also written in C++ to avoid buffer issues. Unlike built in ones, we generally have faster realpath() on directory, since it uses a getcwd() technique, while most other ones out there use a complicated set of steps on every component in the path until the last one is reached, which is especially slow if there are symlinks in the path. We also beat the built in ones regarding speed in cases of link to self, and have support for a deeper set of links. Some implementations also have some fixed sized buffer which can easily get full, and return failure, which is annoying, but our implementation won't run into. Lastly, we offer not resolving the links at all on the last component.

Drawbacks to our implementation is that it won't work on Windows as described in our last post. But you can #ifdef for that, and easily enough wrap around fullpath(). It also isn't thread safe due to the use of chdir() (which many built in implementations also do), see the previous post on how to work around that. Next time, we'll discuss methods to do all this without a chdir() at all, thus ensuring thread safety without any headaches.

40 comments:

Danny said...

I just ran into the PATH_MAX problem - actually, I ran into a buggy work around - and wanted a quick realpath replacement. Google brought me to you. What's the license on your code? Were the AT flags ever standardized?

insane coder said...

Just place a link back to this article in whatever file contains the code, and use it however you wish.

The *at() functions are now part of POSIX 2008.

Also, I recommend understanding the code here, and tweaking as needed. I myself have overhauled this code since I wrote this article, and use that in my projects.

Unknown said...

It looks like the stat() in realpath can err for reasons you specified were unreasonable.

You need to test for ELOOP and ENAMETOOLONG because those may not be valid errors.

KARTHIK said...

Great blog !It is best institute.Top Training institute In chennai
http://chennaitraining.in/oracle-dba-training-in-chennai/
http://chennaitraining.in/sql-server-dba-training-in-chennai/
http://chennaitraining.in/teradata-training-in-chennai/
http://chennaitraining.in/sap-hr-training-in-chennai/
http://chennaitraining.in/sap-fico-training-in-chennai/
http://chennaitraining.in/sap-abap-training-in-chennai/

bavisthra said...

Study Data Analytics Course in Bangalore with ExcelR where you get a great experience and better knowledge.
Data Analytics Course in Bangalore

Augustine Christian said...

Awesome article, it was exceptionally helpful! I simply began in this and I'm becoming more acquainted with it better. The post is written in very a good manner and it contains many useful information for me. Thank you very much and will look for more postings from you.

digital marketing blog
skartec's digital marketing blog
skartec digital marketing academy
skartec digital marketing
best seo service in chennai
best seo services in chennai
high pr dofollow directory submission sites
high da dofollow directory submission sites

onlinepill said...


When you feel any kind of body pain, it is best if you go to the doctor for treating it.
Sometimes body pain can be the symptom of some serious disease. Sometimes body pain attacks
us suddenly because of which you may not able to get the help of the doctor. In those situations,
to get quick and effective pain relief, you can take the help of painkillers though they cannot cure your pain.
As your painkiller, choose Tramadol 50 mg which is very effective. This painkiller is available in
the market with the name of Ultram. To use this painkiller, you can get it easily.
Buy Tramadol online and get this painkiller at an affordable price.

tramadol 50 mg
tramadol 50 mg hcl
tramadol 50 mg hydrochloride
tramadol 50 mg tablet

Ethan jurk said...

On the off chance that you have stuck some place with the installation, or unfit to set up the printer then you are at the perfect spot,Wireless Printer Setup Canon

augustwalker said...

HP Print And Scan Doctor Utility is outstanding amongst other apparatus offered by HP to oversee printing and checking task. HP print and sweep specialist encourages you to investigate your printer gadget. This device can without much of a stretch work on the two Windows and Mac framework. In the event that you have any issue while downloading and introducing this device, contact specialists right away. You will get total answer for fix the mistakes identified with print and output specialist.

onlinepill said...

Being a member of the Association of Indian Universities (AIU), the programs are recognized by WES. Students may verify the same from the WES platform for Canadian Immigration. NMIMS Distance Education program serves its students with highly innovative and revolutionary technology and offers every digital solution to enable faster and most tactful learning process to its students.
nmims distance fee payment
nmims distance university
nmims college distance education
nmims distance result
nmims online distance learning

DY Patil Distance Learning program is aimed towards a slick and rapid improvement in the process of education while maintaining the contemporary standards of the educational industry in the genre of Hospitality and Management. Being renowned as a sophisticated university, DY Patil has several tie-ups with some of the best international industries.
dr dy patil vidyapeeth pune distance mba
ajeenkya dy patil university distance education
dy patil distance education mumbai
dy patil university distance education
dy patil institute of distance learning admission 2020

The School of Distance Education and Learning has executed the same schooling pattern from its mother university and continues to spread its remarkable influence across Rajasthan. The students and alumni members are provided with innovative self-learning study materials. JNU Distance Learning also offers consistent and well-organized counselling programs to its JNU Distance Learning students by renowned and experienced counsellors right at their doorsteps.
jaipur national university distance education bca question paper

jaipur national university distance education contact details

jaipur national university distance education login

jaipur national university distance education ba

\jaipur national university distance education results june 2020

CloudLearn ERP said...

I have honestly never read such overwhelmingly good content like this. I agree with your points and your ideas. This info is really great. Thanks.
Best Data Science training in Mumbai

Data Science training in Mumbai

ajay said...

thanks for sharing us

Hindi Shayari

Hindi Shayari

hindi love shayari 2020

Rap Name Generator

Mod Apk

Picsart Mod Apk

Kinemaster pro Mod Apk

mp4moviez

Latest News

Best laptop under 50000

hindi News

Thanks For Sharing Useful Content

onlinepill said...

Bachelor of Pharmacy (B.Pharm) is an undergraduate degree course in the field of Pharmacy education. The students those are interested in the medical field (except to become a doctor) can choose this course after the completion of class 12th (PCM/B). After the completion of this degree, the students can practice as a Pharmacist. The duration of this course is 4 Total years.
soma 350mg dosage
bpharm
bachelor of pharmacy
best bachelor of pharmacy college
bachelor of pharmacy admission
bpharm colleges
bpharm Admission
best college for bpharm
bpharm mumbai
bachelor of pharmacy admission mumbai
bachelor of pharmacy mumba
bpharm admissioon mumbai

Priyanka said...

Attend The Data Analytics Courses From ExcelR. Practical Data Analytics Courses Sessions With Assured Placement Support From Experienced Faculty. ExcelR Offers The Data Analytics Courses.
Data Analytics Courses

Anirban Ghosh said...

Wow! This article is jam-packed full of useful information. The points made here are clear, concise, readable and poignant. I honestly like your writing style.
SAP training in Kolkata
SAP training Kolkata
Best SAP training in Kolkata
SAP course in Kolkata
SAP training institute Kolkata

onlinepill said...

The issue of erectile dysfunction can be more complicated at times than we usually think. No men would want to suffer from erectile dysfunction. It not just affects them physically but also leaves a deep impact on the minds of the person.
Viagra 100 mg
viagra 100 mg price
Generic Viagra 100 mg
Viagra 100 mg tablet price
Erectile dysfunction is a condition where a man is not able to get an erection. Even if they are able to get an erection, it does not last for too long. Suffering from erectile dysfunction can affect the person both physically and mentally. Therefore a person needs to take medical help to make the condition better. Also suffering from erectile dysfunction can affect the relation of the person with their partners.
Viagra 200 mg
viagra 200 mg price
Generic Viagra 200 mg
Viagra 200 mg tablet price
Erectile dysfunction or impotence is a health issue that is suffered by men. In this problem, the man becomes unable to get the perfect erection. Irregular blood flow in the body is the cause of this problem. This problem may happen because of various reasons.
Viagra 150 mg
viagra 150 mg price
Generic Viagra 150 mg
Viagra 150 mg tablet price

Al Amin said...

nice post.

https://www.azlifehacks.com/2020/06/%20Lose%20Weight%20%20Lose%20Belly%20Fat%20%20How%20To%20Lose%20Belly%20Fat.html

augustwalker said...

garmin gps update - Garmin Update - Garmin Nivi Update And download Garmin Express Easy step to upadate Garmin Device with pc

Priyanka said...

Attend The Data Analyst Course From ExcelR. Practical Data Analyst Course Sessions With Assured Placement Support From Experienced Faculty. ExcelR Offers The Data Analyst Course.
Data Analyst Course

MBBS in Philippines said...

UV GULLAS COLLEGE OF MEDICINE is one of Top Medical College in Philippines in Cebu city. International students have the oppertunity to study medicine in phillipines at affordable cost and world class University. The college has successful alumni who have achieved well in the fields of law, business, politics, academe, medicine, sports and other endeavors. At University of the Visayas, we prepare students for a global competition.

Direct MBBS Admissions Open: 2020-21
Mobile No: +91 90329 55688
Apply Now: https://www.careerplus.org.in/philippines-medical-college/uv-gullas-college-of-medicine

vivekvedha said...

i used to be looking for this certain page....acte velachery reviews

acte tambaram reviews

acte anna nagar reviews

acte porur reviews

acte omr reviews

acte chennai reviews

acte student reviews

ek said...

Just saying thanks will not just be sufficient, for the fantasti c lucidity in your writing. I will instantly grab your rss feed to stay informed of any updates.<a href="https://www.excelr.com/business-analytics-training-in-pune/”> Courses in Business Analytics</a>

Aishu said...

Wow! Such an amazing and helpful blog.I really loved it. I hope that you will continue to do your work like this in future also.
IELTS Coaching in chennai

German Classes in Chennai

GRE Coaching Classes in Chennai

TOEFL Coaching in Chennai

spoken english classes in chennai | Communication training


Joyal said...

Good post and informative. Thank you very much for sharing this good article, it was so good to read and useful to improve my knowledge as updated, keep blogging. Thank you for sharing wonderful information with us to get some idea about that content.
oracle training in chennai

oracle training institute in chennai

oracle training in bangalore

oracle training in hyderabad

oracle training

oracle online training

hadoop training in chennai

hadoop training in bangalore

dhinesh said...

Great site and a great topic as well I really get amazed to read this.This is incredible,I feel really happy to have seen your webpage.I gained many unknown information, the way you have clearly explained is really fantastic.keep posting like this information.
Full Stack Training in Chennai

Full Stack Course Chennai
Full Stack Training in Bangalore

Full Stack Course in Bangalore

Full Stack Training in Hyderabad

Full Stack Course in Hyderabad

Full Stack Training

Full Stack Course

Full Stack Online Training

Full Stack Online Course

ramesh said...

I simply want to mention I am just all new to blogging and site-building and truly loved you’re web page. Almost certainly I’m planning to bookmark your site . You really have outstanding stories. Many thanks for revealing your webpage.…
Azure Training in Chennai

Azure Training in Bangalore

Azure Training in Hyderabad

Azure Training in Pune

Azure Training | microsoft azure certification | Azure Online Training Course

Azure Online Training

rocky said...

Very interesting blog. Many blogs I see these days do not really provide anything that attracts others, but believe me the way you interact is literally awesome.You can also check my articles as well.python training in bangalore

python training in hyderabad

python online training

python training

python flask training

python flask online training

python training in coimbatore
python training in chennai

python course in chennai

python online training in chennai

vijay said...

I simply want to mention I am just all new to blogging and site-building and truly loved you’re web page. Almost certainly I’m planning to bookmark your site . You really have outstanding stories. Many thanks for revealing your webpage.
Salesforce Training in Chennai

Salesforce Online Training in Chennai

Salesforce Training in Bangalore

Salesforce Training in Hyderabad

Salesforce training in ameerpet

Salesforce Training in Pune

Salesforce Online Training

Salesforce Training

ek said...

Your work is very good, and I appreciate you and hopping for some more informative posts
<a href="https://www.excelr.com/business-analytics-training-in-pune/”> Courses in Business Analytics</a>
It is perfect time to make some plans for the future and it is time to be happy. I’ve read this post and if I could I desire to suggest you few interesting things or tips. Perhaps you could write next articles referring to this article. I want to read more things about it!

onlinepill said...


To get generic Viagra without a prescription, you can order Viagra online. In most of the online stores, you don’t have to show any prescription for buying medicines. You can buy Viagra 150 mg online from onlinepillsmart.com without any prescription.
Generic Viagra
tablet Generic Viagra 150mg
Generic Viagra 150mg
150 mg Generic Viagra

onlinepill said...


Nuvigil 150 mg is a smart drug that is effective in keeping away the mental fatigue. Mental fatigue can occur due to physical as well as emotional symptoms. This puts an impact on the person’s behavior, that others may notice even before the person feels.
Nuvigil 150mg
tablet Nuvigil 150mg
Nuvigil
150mg Nuvigil

Neeta Borse said...

This blog is very helpful information thank you so much for sharing this.
Php projects with source code
Online examination system in php
Student management system in php
Php projects for students
Free source code for academic
Academic projects provider in nashik
Academic project free download

Nino Nurmadi , S.Kom said...

ninonurmadi.com
ninonurmadi.com
ninonurmadi.com
ninonurmadi.com
Nino Nurmadi, S.Kom
Nino Nurmadi, S.Kom
Nino Nurmadi, S.Kom
Nino Nurmadi, S.Kom
Nino Nurmadi, S.Kom

antony789 said...

Superb blog post! And this blog clearly explain about for useful information. I would Thanks for sharing this wonderful content. Keep it up!

Software Testing Training in Chennai

Software Testing Online Training in Chennai

Software Testing Courses in Chennai

Software Testing Training in Bangalore

Software Testing Training in Hyderabad

Software Testing Training in Coimbatore

Software Testing Training

Software Testing Online Training

antony789 said...

Superb blog post! And this blog clearly explain about for useful information. I would Thanks for sharing this wonderful content. Keep it up!

Software Testing Training in Chennai

Software Testing Online Training in Chennai

Software Testing Courses in Chennai

Software Testing Training in Bangalore

Software Testing Training in Hyderabad

Software Testing Training in Coimbatore

Software Testing Training

Software Testing Online Training

ek said...

Your work is very good, and I appreciate you and hopping for some more informative posts
<a href="https://www.excelr.com/business-analytics-training-in-pune/”> Courses in Business Analytics </a>
I am always searching online for articles that can help me. There is obviously a lot to know about this. I think you made some good points in Features also. Keep working, great job !

ek said...

<a href="https://www.excelr.com/business-analytics-training-in-pune/”> Business Analytics Courses </a>
I am always searching online for articles that can help me. There is obviously a lot to know about this. I think you made some good points in Features also. Keep working, great job !

Zubair Ahmed said...

if you want to learn SEO and have no idea about
Backlinks,
on page seo and
off page seo then please keep connect withus for
complete guideline.

ek said...

Your work is very good, and I appreciate you and hopping for some more informative posts
<a href="https://www.excelr.com/business-analytics-training-in-pune/”> Courses in Business Analytics </a>
I am always searching online for articles that can help me. There is obviously a lot to know about this. I think you made some good points in Features also. Keep working, great job !

jazz internet said...

lyzafashion is the name of a fashion icon.
You can find the latest fashion trending outfits with a full range that suitable for all communities. lyzafashion deal bodycon, casual, Day, Evening, Maxi, Mini Shift, and Skater dress.

Best Carpet Cleaner: We are professional to deal
with Best Carpet Cleaner of all ranges. we are dealing with top-rated quality brands of cleaner like Hoover, Doctor Rug, Shark, Bissell.