Sunday, November 11, 2007


Directory safety when working with paths



As we discussed in our previous two articles, we run into thread safety issues when we change the working directory of a running program. Therefore, in addition to bypassing PATH_MAX problems and working around buffer issues by implementing in C++, we would also love to implement our functions without ever calling chdir() and changing the working directory. We'll now look into how we can change our implementation to avoid chdir() calls, along with the pros and cons of the techniques UNIX operating systems provide us, as well as what they're lacking.

If one looks back at how we implemented getcwd(), they'll notice that we call chdir("..") for each step in our loop. We need this so that the relative paths in these three calls refer to the right directory:

opendir(".")
lstat(entry->d_name, &sb)
stat(".", &sb)


To remove the need to change directories, we can place the ".." paths directly into our calls to opendir(), lstat(), and stat(). Since C++ strings are dynamic, it's easy to manipulate paths properly for these situations. We'll also see that by being clever, we can avoid string manipulation for some of these cases, and if our OSs supplied enough functionality, we could avoid string manipulation altogether!

Here's how we can implement getcwd() based on our previous implementation, but without calls to chdir(), with key details highlighted:

#include <dirent.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

bool getcwd(std::string& path)
{
  typedef std::pair<dev_t, ino_t> file_id;

  bool success = false;
  struct stat sb;
  if (!stat(".", &sb))
  {
    file_id current_id(sb.st_dev, sb.st_ino);
    if (!stat("/", &sb)) //Get info for root directory, so we can determine when we hit it
    {
      std::vector<std::string> path_components;
      file_id root_id(sb.st_dev, sb.st_ino);
      std::string up_path("..");

      while (current_id != root_id) //If they're equal, we've obtained enough info to build the path
      {
        bool pushed = false;
        DIR *dir = opendir(up_path.c_str());
        if (dir)
        {
          dirent *entry;
          up_path += "/";
          std::string::size_type after_slash = up_path.size();
          while ((entry = readdir(dir))) //We loop through each entry trying to find where we came from
          {
            if (strcmp(entry->d_name, ".") && strcmp(entry->d_name, ".."))
            {
              up_path.replace(after_slash, std::string::npos, entry->d_name);
              if (!lstat(up_path.c_str(), &sb))
              {
                file_id child_id(sb.st_dev, sb.st_ino);
                if (child_id == current_id) //We found where we came from, add its name to the list
                {
                  path_components.push_back(entry->d_name);
                  pushed = true;
                  break;
                }
              }
            }
          }

          if (pushed && !fstat(dirfd(dir), &sb)) //If we have a reason to continue, we update the current dir id
          {
            current_id = file_id(sb.st_dev, sb.st_ino);
            up_path.replace(after_slash, std::string::npos, ".."); //Keep recursing towards root each iteration
          }

          closedir(dir);
        }
        if (!pushed) { break; } //If we didn't obtain any info this pass, no reason to continue
      }

      if (current_id == root_id) //Unless they're equal, we failed above
      {
        //Built the path, will always end with a slash
        path = "/";
        for (std::vector<std::string>::reverse_iterator i = path_components.rbegin(); i != path_components.rend(); ++i)
        {
          path += *i+"/";
        }
        success = true;
      }
    }
  }

  return(success);
}

First of all, we removed all the code related to saving the current directory when we started, since we don't need that information anymore. Our first major change is a new C++ string called up_path, which we use to keep track of the path to the directory being scanned. We initialize it to "..", then on each pass through the loop make it "../..", "../../..", and so on, until we reach the root. We pass it to opendir() in place of the "." we used before.
Once the directory is open, we add a slash to the path and remember the position after it in the variable after_slash. Now in our directory reading loop, we can replace whatever is after the slash with the name of each entry and pass the result to lstat(), again bypassing the need to be in the same directory as the file when making the call.
The stat() call on the directory itself is where things get a little interesting. Instead of doing a path manipulation trick again, we call fstat() on the file descriptor returned from dirfd() on the already open directory handle. Notice how the call to closedir() has been moved to after this block of code, so the directory is still open. Finally, we wrap up by replacing whatever is after the slash with "..", ready for the next pass.
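
To make the after_slash trick concrete, here is a minimal sketch that isolates just the string handling from the loop above. The helper name candidate_paths() is hypothetical and not part of the implementation; it only collects the strings that would be handed to lstat() for a few entry names, followed by the path that would be opened on the next pass.

```cpp
#include <string>
#include <vector>

//Hypothetical helper, for illustration only: mimics the string handling inside the loop
std::vector<std::string> candidate_paths(std::string up_path, const std::vector<std::string>& names)
{
  std::vector<std::string> result;
  up_path += "/";
  std::string::size_type after_slash = up_path.size(); //Everything from here on gets overwritten

  for (std::vector<std::string>::const_iterator i = names.begin(); i != names.end(); ++i)
  {
    up_path.replace(after_slash, std::string::npos, *i); //What we'd pass to lstat()
    result.push_back(up_path);
  }

  up_path.replace(after_slash, std::string::npos, ".."); //What we'd pass to opendir() next pass
  result.push_back(up_path);
  return(result);
}
```

Starting from ".." with the entries "bin" and "home", this yields "../bin", "../home", and finally "../..". The string only ever grows by one "/.." per level, no matter how long the entry names are.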

Having eliminated path manipulation for the last part, it would be nice to eliminate more of it, especially if anyone wants to port this C++ code to C. The good news is that on Solaris and Linux we can, and as soon as the "at" functions get standardized, we can use them on the other UNIX OSs too. You can read more about them in one of our previous articles, File Descriptors and why we can't use them.

Here's how we can use the at functions to eliminate the path manipulation needed for calling lstat(), with the differences highlighted:

#include <fcntl.h> //For AT_SYMLINK_NOFOLLOW, in addition to the headers used before

bool getcwd(std::string& path)
{
  typedef std::pair<dev_t, ino_t> file_id;

  bool success = false;
  struct stat sb;
  if (!stat(".", &sb))
  {
    file_id current_id(sb.st_dev, sb.st_ino);
    if (!stat("/", &sb)) //Get info for root directory, so we can determine when we hit it
    {
      std::vector<std::string> path_components;
      file_id root_id(sb.st_dev, sb.st_ino);
      std::string up_path("..");

      while (current_id != root_id) //If they're equal, we've obtained enough info to build the path
      {
        bool pushed = false;
        DIR *dir = opendir(up_path.c_str());
        if (dir)
        {
          int dir_fd = dirfd(dir);
          dirent *entry;
          while ((entry = readdir(dir))) //We loop through each entry trying to find where we came from
          {
            if (strcmp(entry->d_name, ".") && strcmp(entry->d_name, "..") && !fstatat(dir_fd, entry->d_name, &sb, AT_SYMLINK_NOFOLLOW))
            {
              file_id child_id(sb.st_dev, sb.st_ino);
              if (child_id == current_id) //We found where we came from, add its name to the list
              {
                path_components.push_back(entry->d_name);
                pushed = true;
                break;
              }
            }
          }

          if (pushed && !fstat(dir_fd, &sb)) //If we have a reason to continue, we update the current dir id
          {
            current_id = file_id(sb.st_dev, sb.st_ino);
            up_path += "/.."; //Keep recursing towards root each iteration
          }

          closedir(dir);
        }
        if (!pushed) { break; } //If we didn't obtain any info this pass, no reason to continue
      }

      if (current_id == root_id) //Unless they're equal, we failed above
      {
        //Built the path, will always end with a slash
        path = "/";
        for (std::vector<std::string>::reverse_iterator i = path_components.rbegin(); i != path_components.rend(); ++i)
        {
          path += *i+"/";
        }
        success = true;
      }
    }
  }

  return(success);
}

As should be easily discernible, the string manipulation got easier, and mostly vanished. All the tracking of a slash position and the calls to replace() are gone; we only need to append "/.." at the end of each pass through the loop.
Instead of manipulating a path string for our call to lstat(), we save the directory's file descriptor up front (and reuse it lower down for fstat() instead of fetching it again), then use it to get lstat() information for each entry in the directory, but now via fstatat().
fstatat() is the same as stat()/lstat(), but takes a directory descriptor as its first parameter, and a relative filename is resolved relative to that directory. The last parameter can be 0 or AT_SYMLINK_NOFOLLOW, which makes it act like stat() or lstat() respectively. In place of a directory descriptor, fstatat() can also take the special value AT_FDCWD to have it work relative to the current directory.

This implementation is more elegant, but a bit less portable. If one would like to implement fstatat() themselves, it's not hard. Here's how you can do it:

#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

int fstatat(int dirfd, const char *pathname, struct stat *buf, int flags)
{
  int success = -1;

  if ((!flags || (flags == AT_SYMLINK_NOFOLLOW)))
  {
    int cwdfd = -1;
    //Only save the working directory and switch to dirfd when the path is really relative to dirfd
    if ((dirfd == AT_FDCWD) || (pathname && (*pathname == '/')) || (((cwdfd=open(".", O_RDONLY)) != -1) && !fchdir(dirfd)))
    {
      success = (!flags) ? stat(pathname, buf) : lstat(pathname, buf);
    }

    if (cwdfd != -1) //Restore the original working directory
    {
      fchdir(cwdfd);
      close(cwdfd);
    }
  }
  else
  {
    errno = EINVAL;
  }

  return(success);
}

You'll also need the defines for AT_FDCWD and AT_SYMLINK_NOFOLLOW. You have to make sure that the value you choose for AT_FDCWD can't be a valid file descriptor or the normal failure return value of -1. Therefore I chose -2 in my implementations, but any negative value other than -1 should be okay. The exact value of AT_SYMLINK_NOFOLLOW doesn't matter as long as it isn't 0, since 0 means follow links; a value of 1 or, for all I care, 42, should be fine.
If your OS supports the at functions (currently Linux 2.6.16+ and recent versions of Solaris), it's better not to use a custom implementation. This one isn't thread safe, since it calls fchdir() internally, which for our example runs counter to the whole point of using fstatat(). It would also be faster to use the original getcwd() implementation from a few days ago than one built on an emulated fstatat(), since the original avoids the overhead of repeated fchdir() calls.
It's also interesting to note that glibc on Linux now implements fstatat() and the whole slew of at functions even on kernels older than 2.6.16 which don't support them. Its emulation is similar to ours, can be thread unsafe due to changing the working directory, and for some inexplicable reason segfaulted in certain circumstances when I ran a large battery of tests on it and my implementation against the kernel's to make sure that mine was working properly.
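
For reference, the defines mentioned above might look something like the following sketch. The values -2 and 1 are the ones chosen in the text; the #ifndef guards are an addition of this example, so that a system which already provides the constants keeps its own values.

```cpp
#include <fcntl.h> //Systems with native at functions already define both constants here

#ifndef AT_FDCWD
#define AT_FDCWD (-2) //Must not be a valid descriptor (0 or higher) nor the failure value (-1)
#endif

#ifndef AT_SYMLINK_NOFOLLOW
#define AT_SYMLINK_NOFOLLOW 1 //Must be nonzero, since 0 is what requests stat() behavior
#endif
```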

Anyways, with that out of the way, one can't help but wonder if it would be possible to also eliminate the need for any path string manipulation in the call to opendir(). Mysteriously, neither Solaris nor Linux has an opendirat() call. If there were one, we could easily keep the previous directory open until we obtained a new handle for its parent directory.
Barring a native opendirat() function, it'd be nice to implement one ourselves. I understand some UNIX OSs have a function, perhaps named fopendir() or fdopendir(), to promote a file descriptor to a directory handle. With such a function, we can simply write an opendirat() which calls openat() and then promotes the result. Solaris has such a function (fdopendir()), but Linux doesn't have a method of promotion. Let's hope the people who are standardizing the new at functions for POSIX seriously consider what they're doing, otherwise we won't be able to use them.
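
On a system that does offer openat(), fdopendir(), and fstatat(), the idea can be sketched as follows. This is only an illustration under that assumption, not a tested replacement for the versions above: opendirat() is the hypothetical function described in the previous paragraph, and getcwd_at() is a made-up name for a variant of our getcwd() that keeps the previous directory open and performs no path string manipulation at all.

```cpp
#include <dirent.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

//Hypothetical opendirat(): open `name` relative to dir_fd, then promote the descriptor
DIR *opendirat(int dir_fd, const char *name)
{
  DIR *dir = 0;
  int fd = openat(dir_fd, name, O_RDONLY);
  if (fd != -1)
  {
    dir = fdopendir(fd); //On success the handle owns fd, closedir() will close it
    if (!dir) { close(fd); }
  }
  return(dir);
}

//Hypothetical getcwd() variant which never builds a "../.." style path
bool getcwd_at(std::string& path)
{
  typedef std::pair<dev_t, ino_t> file_id;

  bool success = false;
  struct stat sb;
  if (!stat(".", &sb))
  {
    file_id current_id(sb.st_dev, sb.st_ino);
    if (!stat("/", &sb)) //Get info for root directory, so we can determine when we hit it
    {
      std::vector<std::string> path_components;
      file_id root_id(sb.st_dev, sb.st_ino);
      DIR *dir = opendirat(AT_FDCWD, "."); //Handle for the directory we start in

      while (dir && (current_id != root_id))
      {
        DIR *parent = opendirat(dirfd(dir), ".."); //Open the parent relative to the child
        closedir(dir); //The child handle has served its purpose
        dir = parent;
        if (!dir) { break; }

        bool pushed = false;
        int dir_fd = dirfd(dir);
        dirent *entry;
        while ((entry = readdir(dir))) //Look for the entry we just came from
        {
          if (strcmp(entry->d_name, ".") && strcmp(entry->d_name, "..") &&
              !fstatat(dir_fd, entry->d_name, &sb, AT_SYMLINK_NOFOLLOW) &&
              (file_id(sb.st_dev, sb.st_ino) == current_id))
          {
            path_components.push_back(entry->d_name);
            pushed = true;
            break;
          }
        }

        if (!pushed || fstat(dir_fd, &sb)) { break; } //No progress this pass, give up
        current_id = file_id(sb.st_dev, sb.st_ino);
      }
      if (dir) { closedir(dir); }

      if (current_id == root_id) //Unless they're equal, we failed above
      {
        path = "/";
        for (std::vector<std::string>::reverse_iterator i = path_components.rbegin(); i != path_components.rend(); ++i)
        {
          path += *i+"/";
        }
        success = true;
      }
    }
  }

  return(success);
}
```

The only strings passed to the OS here are the constants "." and "..", plus entry names exactly as readdir() returned them, so no path ever grows with the depth of the directory tree.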

Now that we've gotten getcwd() to be safe, we still need realpath() to be. In our realpath() implementation, the only unsafe part was our function chdir_getcwd(), which called chdir() internally as well as getcwd(). getcwd() is now taken care of, but we still need to rewrite the rest of chdir_getcwd() so it doesn't actually chdir().
Doing so should now be easy: we simply do what getcwd() does, but with a different starting path. We'd make a new function called getcwd_internal() which takes two extra parameters, one for the starting directory and another to initialize the up path, and which we can wrap everything else around. Basically copy your favorite getcwd(), but modify the beginning like so:

bool getcwd_internal(std::string& path, const std::string& start_dir, std::string& up_path)
{
  typedef std::pair<dev_t, ino_t> file_id;

  bool success = false;
  struct stat sb;
  if (!stat(start_dir.c_str(), &sb))
  {
    file_id current_id(sb.st_dev, sb.st_ino);
    if (!stat("/", &sb)) //Get info for root directory, so we can determine when we hit it
    {
      std::vector<std::string> path_components;
      file_id root_id(sb.st_dev, sb.st_ino);

      while (current_id != root_id) //If they're equal, we've obtained enough info to build the path
--SNIP--


Now we'd turn getcwd() and chdir_getcwd() into wrappers like so:

bool getcwd(std::string& path)
{
  std::string up_path("..");
  return(getcwd_internal(path, ".", up_path));
}

bool chdir_getcwd(const std::string& dir, std::string& path)
{
  std::string up_path(dir+"/..");
  return(getcwd_internal(path, dir, up_path));
}


Note that the name chdir_getcwd() is now misleading, since the function doesn't actually change directories anymore. This is why it's a good idea not to put implementation details into a function name: a user shouldn't have to know about them, and over time the name can become inaccurate.

And there you have it folks, getcwd() and realpath() implemented safely. Hopefully in the process all you loyal readers learned a couple other things too.

Another improvement on everything we discussed might be to run a smart pass over concatenated path names at key locations to collapse runs of slashes "///", remove unneeded current directory references "/./", and remove redundant parent directory references "path1/path2/.." -> "path1/". This is useful if certain functions like opendir() or the stat() family of calls have internal buffer issues. But how to do that is a discussion for another time.
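
Just to give a rough idea of what such a pass might look like, here is a purely lexical sketch. The function name tidy_path() is hypothetical, and the sketch makes two simplifying assumptions that a real "smart" pass would have to revisit: it never consults the filesystem, so collapsing "dir/.." is only correct when "dir" is not a symbolic link, and it drops trailing slashes rather than preserving them.

```cpp
#include <string>
#include <vector>

//Hypothetical helper: lexical cleanup only, it never looks at the filesystem
std::string tidy_path(const std::string& path)
{
  bool absolute = !path.empty() && (path[0] == '/');
  std::vector<std::string> parts;

  std::string::size_type pos = 0;
  while (pos < path.size())
  {
    std::string::size_type end = path.find('/', pos);
    if (end == std::string::npos) { end = path.size(); }
    std::string part(path, pos, end - pos);

    if (part == "..")
    {
      if (!parts.empty() && (parts.back() != "..")) { parts.pop_back(); } //Cancels the previous component
      else if (!absolute) { parts.push_back(part); } //A relative path may need to keep climbing
      //For an absolute path, "/.." is still "/", so the component is dropped
    }
    else if (!part.empty() && (part != ".")) //Skips the empty parts of "///" and the "." of "/./"
    {
      parts.push_back(part);
    }
    pos = end + 1;
  }

  std::string result(absolute ? "/" : "");
  for (std::vector<std::string>::size_type i = 0; i != parts.size(); ++i)
  {
    if (i) { result += "/"; }
    result += parts[i];
  }
  if (result.empty()) { result = "."; } //Everything cancelled out in a relative path
  return(result);
}
```

With this sketch, "a///b" becomes "a/b", "/a/./b" becomes "/a/b", and "path1/path2/.." becomes "path1", while a leading "../.." in a relative path is left alone, since there is nothing before it to cancel.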

This wraps up our long discussion. All comments and suggestions are welcome.

39 comments:

insane coder said...

Just wanted to update to say that it seems Linux does now have fdopendir(), it just doesn't seem to be listed in the manpages yet.
