How do I get a list of files from a web directory?


How do I get a list of files from a web directory? If I open the web directory URL, the browser lists all the files in that directory. I want to get that list in C# and download the files with BITS (Background Intelligent Transfer Service).


There are 5 answers

1
Jay Riggs

This is an interesting topic I investigated fairly recently. As you know, you can access BITS via COM, but here are a couple of projects that make it easier:

SharpBITS.NET
Forms Designer Friendly Background Intelligent Transfer Service (BITS) wrapper

This article on MSDN might be a bit more than you want to know.

I experimented with the code in the CodeProject link and it seemed to work reasonably well. The CodePlex project looks really good but I haven't tried it.

3
Rubens Farias

About the "get that list in C#" part (this assumes the code runs in an ASP.NET page on the server that hosts the files):

foreach (string filename in 
    Directory.GetFiles(
        Server.MapPath("/"), "*.jpg", 
        SearchOption.AllDirectories))
{
    Response.Write(
        String.Format("{0}<br />", 
            Server.HtmlEncode(filename)));
}
0
Mike Gleason jr Couturier

Well, if the web server allows listing the files in the directory in question, you're good to go.

Unfortunately, there's no standard for how a web server should return the list. It is often HTML, but the HTML is not formatted the same way across web servers.

If you always download files from the same directory on the same web server, just do a "view source" on the directory page in your web browser. Then try to write a small regular expression that grabs every file name from the HTML source.

You can then create a WebClient, request the directory URL, parse the response with your regular expression to get the file names, and process the files with your BITS client.
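The steps above could be sketched roughly as follows. This is only a minimal sketch, not a definitive implementation: the class and method names (`DirectoryListing`, `ExtractFileUrls`, `ListFiles`), the sample HTML, and the `example.com` URL are all made up for illustration, and the regular expression assumes the server writes double-quoted `href` attributes and marks directories with a trailing slash. Check your own server's "view source" output and adjust it.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

public static class DirectoryListing
{
    // Captures the href value of every anchor tag.
    // Assumption: href attributes are double-quoted.
    private static readonly Regex HrefRegex = new Regex(
        "<a\\s[^>]*?href\\s*=\\s*\"(?<href>[^\"]*)\"",
        RegexOptions.IgnoreCase);

    // Returns the absolute URLs of the links that look like files.
    public static List<Uri> ExtractFileUrls(string html, Uri directoryUrl)
    {
        var files = new List<Uri>();
        foreach (Match match in HrefRegex.Matches(html))
        {
            string href = WebUtility.HtmlDecode(match.Groups["href"].Value);

            // Skip empty links, sort links such as "?C=N;O=D", and
            // anything ending in "/" (sub- or parent directories).
            if (href.Length == 0
                || href.StartsWith("?", StringComparison.Ordinal)
                || href.EndsWith("/", StringComparison.Ordinal))
            {
                continue;
            }

            // Resolve relative links against the directory URL.
            files.Add(new Uri(directoryUrl, href));
        }
        return files;
    }

    // Downloads the listing page with WebClient and parses it.
    public static List<Uri> ListFiles(string directoryUrl)
    {
        using (var client = new WebClient())
        {
            string html = client.DownloadString(directoryUrl);
            return ExtractFileUrls(html, new Uri(directoryUrl));
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Hypothetical listing HTML, so the parser can be tried offline.
        string html =
            "<a href=\"../\">Parent</a>\n" +
            "<a href=\"sub/\">sub/</a>\n" +
            "<a href=\"a.jpg\">a.jpg</a>\n" +
            "<A HREF=\"/files/b%20c.txt\">b c.txt</A>";
        var baseUrl = new Uri("http://example.com/files/");

        foreach (Uri file in DirectoryListing.ExtractFileUrls(html, baseUrl))
        {
            Console.WriteLine(file.AbsoluteUri);
        }
    }
}
```

Against a real server you would call `DirectoryListing.ListFiles(url)` instead and hand each returned URL to your BITS client. Keeping the parsing separate from the download makes it easy to test the regular expression against saved HTML first.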

Hope this helps

0
yu yang Jian

I wrote some code that gets all the path information, for both files and directories, from an IIS site that allows directory listing. You can customize the regex to match your needs (or switch to an HTML parser). You can also add code of your own to get more detail, such as file size or creation time.

You can get all the path information in two lines:

List<PathInfo> pathInfos = new List<PathInfo>();
HttpHelper.GetAllFilePathAndSubDirectory("http://localhost:33333/", pathInfos);

The helper code:

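using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

public static class HttpHelper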
{
    public static string ReadHtmlContentFromUrl(string url)
    {
        string html = string.Empty;
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(stream))
        {
            html = reader.ReadToEnd();
        }
        //Console.WriteLine(html);
        return html;
    }

    public static void GetAllFilePathAndSubDirectory(string baseUrl, List<PathInfo> pathInfos)
    {
        Uri baseUri = new Uri( baseUrl.TrimEnd('/') );
        string rootUrl = baseUri.GetLeftPart(UriPartial.Authority);


        Regex regexFile = new Regex("[0-9] <a href=\"(http:|https:)?(?<file>.*?)\"", RegexOptions.IgnoreCase);
        Regex regexDir = new Regex("dir.*?<a href=\"(http:|https:)?(?<dir>.*?)\"", RegexOptions.IgnoreCase);

        string html = ReadHtmlContentFromUrl(baseUrl);
        //Files
        MatchCollection matchesFile = regexFile.Matches(html);
        if (matchesFile.Count != 0)
            foreach (Match match in matchesFile)
                if (match.Success)
                    pathInfos.Add(
                        new PathInfo( rootUrl + match.Groups["file"], false));
        //Dir
        MatchCollection matchesDir = regexDir.Matches(html);
        if (matchesDir.Count != 0)
            foreach (Match match in matchesDir)
                if (match.Success)
                {
                    var dirInfo = new PathInfo(rootUrl + match.Groups["dir"], true);
                    GetAllFilePathAndSubDirectory(dirInfo.AbsoluteUrlStr, dirInfo.Childs);
                    pathInfos.Add(dirInfo);
                }                        

    }


    public static void PrintAllPathInfo(List<PathInfo> pathInfos)
    {
        pathInfos.ForEach(f =>
        {
            Console.WriteLine(f.AbsoluteUrlStr);
            PrintAllPathInfo(f.Childs);
        });
    }

}



public class PathInfo
{
    public PathInfo(string absoluteUri, bool isDir)
    {
        AbsoluteUrl = new Uri(absoluteUri);
        IsDir = isDir;
        Childs = new List<PathInfo>();
    }

    public Uri AbsoluteUrl { get; set; }

    public string AbsoluteUrlStr
    {
        get { return AbsoluteUrl.ToString(); }
    }

    public string RootUrl
    {
        get { return AbsoluteUrl.GetLeftPart(UriPartial.Authority); }
    }

    public string RelativeUrl
    {
        get { return AbsoluteUrl.PathAndQuery; }
    }

    public string Query
    {
        get { return AbsoluteUrl.Query; }
    }

    public bool IsDir { get; set; }
    public List<PathInfo> Childs { get; set; }


    public override string ToString()
    {
        return String.Format("{0} IsDir {1} ChildCount {2} AbsUrl {3}", RelativeUrl, IsDir, Childs.Count, AbsoluteUrlStr);
    }
}
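Since the goal in the question is to download the files, one possible next step is to flatten the tree into a plain sequence of file URLs. The sketch below is an illustration only: `EnumerateFileUrls` is a hypothetical helper that is not part of the code above, and the `PathInfo` class here is a trimmed stand-in that keeps only the members the sketch needs, so that the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;

// Trimmed stand-in for the PathInfo class above (same member names),
// included only so this sketch is self-contained.
public class PathInfo
{
    public PathInfo(string absoluteUri, bool isDir)
    {
        AbsoluteUrl = new Uri(absoluteUri);
        IsDir = isDir;
        Childs = new List<PathInfo>();
    }

    public Uri AbsoluteUrl { get; set; }
    public bool IsDir { get; set; }
    public List<PathInfo> Childs { get; set; }
}

public static class PathInfoWalker
{
    // Hypothetical helper: walks the tree depth-first and yields
    // only the entries that are files, skipping directory entries.
    public static IEnumerable<Uri> EnumerateFileUrls(IEnumerable<PathInfo> pathInfos)
    {
        foreach (PathInfo info in pathInfos)
        {
            if (!info.IsDir)
            {
                yield return info.AbsoluteUrl;
            }
            foreach (Uri child in EnumerateFileUrls(info.Childs))
            {
                yield return child;
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Hand-built tree standing in for the result of
        // HttpHelper.GetAllFilePathAndSubDirectory.
        var sub = new PathInfo("http://localhost:33333/sub/", true);
        sub.Childs.Add(new PathInfo("http://localhost:33333/sub/b.txt", false));

        var pathInfos = new List<PathInfo>
        {
            new PathInfo("http://localhost:33333/a.txt", false),
            sub
        };

        foreach (Uri file in PathInfoWalker.EnumerateFileUrls(pathInfos))
        {
            Console.WriteLine(file.AbsoluteUri);
        }
    }
}
```

Each URL it yields can then be queued as a separate download in whichever BITS wrapper you use.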
0
Yordan Georgiev
private void ListFiles()
{

    // Get the user calling this page.
    Gaf.Bl.User userObj = base.User;
    // Get the debug directory of this user.
    string strDebugDir = userObj.UserSettings.DebugDir;
    // Wrap the directory path in a DirectoryInfo.
    DirectoryInfo di = new DirectoryInfo(strDebugDir);
    if (di.Exists)
    {
        // Get the HTML files in this directory.
        FileInfo[] rgFiles = di.GetFiles("*.html");
        // Copy them into a list, which is easier to sort.
        List<FileInfo> listFileInfo = new List<FileInfo>(rgFiles);
        // Sort in place, descending by the file's full path.
        listFileInfo.Sort((x, y) => string.Compare(y.FullName, x.FullName));
        // Print each file as a link. The href is quoted and the name is
        // encoded so spaces or special characters do not break the markup.
        foreach (FileInfo fi in listFileInfo)
        {
            Response.Write("<br><a href=\"" + Uri.EscapeDataString(fi.Name) + "\">"
                + Server.HtmlEncode(fi.Name) + "</a>");
        } // end foreach
    } // end if directory exists

} //eof method