How do I get a list of files from a web directory? If I open the web directory URL, the browser lists all the files in that directory. I want to get that list in C# and download the files with BITS (Background Intelligent Transfer Service).
Asked by Eric. There are 5 answers.
If the web server allows listing the files in the directory in question, you're good to go.
Unfortunately, there's no standard for how a web server returns the list. It is often HTML, but the HTML is not formatted the same way across web servers.
If you always download files from the same directory on the same web server, just do a "view source" on the directory in your web browser. Then write a small regular expression that grabs every file name from the HTML source.
You can then create a WebClient, request the directory URL, parse the response with your regular expression to get the file names, and process the files with your BITS client.
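The WebClient-plus-regex approach described above can be sketched roughly as follows. This is only a minimal sketch: the class name `DirectoryListing`, its method names, and the `href` pattern are made up for illustration, and the pattern assumes the server emits double-quoted `href` attributes, so adjust it to whatever "view source" shows for your server. Links that start with `?` (column-sort links) or end with `/` (parent and subdirectories) are skipped on the assumption that they are not files.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

public static class DirectoryListing
{
    // Captures the value of each double-quoted href in an anchor tag.
    // Assumption: tune this to the HTML your server actually produces.
    private static readonly Regex HrefRegex =
        new Regex("<a\\s[^>]*?href\\s*=\\s*\"(?<href>[^\"]+)\"", RegexOptions.IgnoreCase);

    // Parsing step: returns absolute URLs of links that look like files
    // located under the given directory.
    public static List<Uri> ExtractFileUrls(string html, Uri directoryUri)
    {
        var files = new List<Uri>();
        foreach (Match match in HrefRegex.Matches(html))
        {
            string href = WebUtility.HtmlDecode(match.Groups["href"].Value);

            // Skip sort links ("?C=N;O=D") and directories (trailing slash).
            if (href.StartsWith("?") || href.EndsWith("/"))
                continue;

            // Resolve relative links and keep only those below the directory.
            Uri absolute;
            if (Uri.TryCreate(directoryUri, href, out absolute) && directoryUri.IsBaseOf(absolute))
                files.Add(absolute);
        }
        return files;
    }

    // Network step: downloads the listing page and parses it.
    public static List<Uri> ListFiles(string directoryUrl)
    {
        // The trailing slash matters: relative hrefs resolve against the directory.
        var directoryUri = new Uri(directoryUrl.TrimEnd('/') + "/");
        using (var client = new WebClient())
        {
            string html = client.DownloadString(directoryUri);
            return ExtractFileUrls(html, directoryUri);
        }
    }
}
```

Calling `DirectoryListing.ListFiles("http://example.com/files")` (a placeholder URL) would give you the list of file URLs to hand to whichever BITS wrapper you use. Keeping the parsing separate from the download makes it easy to test the regex against HTML you saved from the browser.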
Hope this helps
I wrote some code that gets all path info, for both files and directories, from an IIS site that allows directory listing. You can customize the regexes to match your needs (or switch to an HTML parser). You can also extend the code to get more detailed info, such as file size or creation time.
You can get all the path info in two lines:
List<PathInfo> pathInfos = new List<PathInfo>();
HttpHelper.GetAllFilePathAndSubDirectory("http://localhost:33333/", pathInfos);
The helper code:
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

public static class HttpHelper
{
    // Downloads the page at the given URL and returns its HTML as a string.
    public static string ReadHtmlContentFromUrl(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }

    // Recursively collects every file and subdirectory listed under baseUrl into pathInfos.
    public static void GetAllFilePathAndSubDirectory(string baseUrl, List<PathInfo> pathInfos)
    {
        Uri baseUri = new Uri(baseUrl.TrimEnd('/'));
        string rootUrl = baseUri.GetLeftPart(UriPartial.Authority);
        // Patterns for the IIS listing HTML: a file link follows a digit (the file size),
        // a directory link follows the text "dir". Adjust them for other servers.
        Regex regexFile = new Regex("[0-9] <a href=\"(http:|https:)?(?<file>.*?)\"", RegexOptions.IgnoreCase);
        Regex regexDir = new Regex("dir.*?<a href=\"(http:|https:)?(?<dir>.*?)\"", RegexOptions.IgnoreCase);
        string html = ReadHtmlContentFromUrl(baseUrl);

        // Files
        foreach (Match match in regexFile.Matches(html))
        {
            pathInfos.Add(new PathInfo(rootUrl + match.Groups["file"].Value, false));
        }

        // Directories: recurse into each one before adding it to the list.
        foreach (Match match in regexDir.Matches(html))
        {
            var dirInfo = new PathInfo(rootUrl + match.Groups["dir"].Value, true);
            GetAllFilePathAndSubDirectory(dirInfo.AbsoluteUrlStr, dirInfo.Childs);
            pathInfos.Add(dirInfo);
        }
    }

    // Prints every collected URL, depth-first.
    public static void PrintAllPathInfo(List<PathInfo> pathInfos)
    {
        pathInfos.ForEach(f =>
        {
            Console.WriteLine(f.AbsoluteUrlStr);
            PrintAllPathInfo(f.Childs);
        });
    }
}
public class PathInfo
{
    public PathInfo(string absoluteUri, bool isDir)
    {
        AbsoluteUrl = new Uri(absoluteUri);
        IsDir = isDir;
        Childs = new List<PathInfo>();
    }

    public Uri AbsoluteUrl { get; set; }

    public string AbsoluteUrlStr
    {
        get { return AbsoluteUrl.ToString(); }
    }

    public string RootUrl
    {
        get { return AbsoluteUrl.GetLeftPart(UriPartial.Authority); }
    }

    public string RelativeUrl
    {
        get { return AbsoluteUrl.PathAndQuery; }
    }

    public string Query
    {
        get { return AbsoluteUrl.Query; }
    }

    public bool IsDir { get; set; }

    public List<PathInfo> Childs { get; set; }

    public override string ToString()
    {
        return String.Format("{0} IsDir {1} ChildCount {2} AbsUrl {3}", RelativeUrl, IsDir, Childs.Count, AbsoluteUrlStr);
    }
}
private void ListFiles()
{
    // Get the user calling this page
    Gaf.Bl.User userObj = base.User;
    // Get the debug directory of this user
    string strDebugDir = userObj.UserSettings.DebugDir;
    // Construct the DirectoryInfo for that directory
    DirectoryInfo di = new DirectoryInfo(strDebugDir);
    if (di.Exists)
    {
        // Get the array of HTML files in this directory
        FileInfo[] rgFiles = di.GetFiles("*.html");
        // Copy into a list, which is easier to sort
        List<FileInfo> listFileInfo = new List<FileInfo>(rgFiles);
        // Sort descending by the file's full path
        listFileInfo.Sort((x, y) => string.Compare(y.FullName, x.FullName));
        // Write one link per file; the href is quoted so names with spaces still work
        foreach (FileInfo fi in listFileInfo)
        {
            Response.Write("<br><a href=\"" + fi.Name + "\">" + fi.Name + "</a>");
        }
    }
}
This is an interesting topic that I investigated fairly recently. As you know, you can access BITS via COM, but here are a couple of projects that make it easier:
SharpBITS.NET
Forms Designer Friendly Background Intelligent Transfer Service (BITS) wrapper
This article on MSDN might be a bit more than you want to know.
I experimented with the code from the CodeProject article (the second project above) and it seemed to work reasonably well. The CodePlex project (SharpBITS.NET) looks really good, but I haven't tried it.