Remove Multiline Comments only from Top of Every Java File

We used to manage our code with Borland StarTeam (a revision control system, like Mercurial). Whenever we committed code, the tool itself inserted a description of the commit at the top of the file. As a result, many of our classes now have such a log comment at the top of the file. For example:

/*This is some developer comment at the top of the file*/

/*
 * $Log:
 *  1   Client Name 1.0   07/11/2012 16:28:54  Umair Khalid did something
 *  2   Client Name 1.0   07/11/2012 16:28:54  Umair Khalid again did 
 *                                             something
 * $
 */

public class ABC
{
  /*This is just a variable*/
  int a = 0;
  public int method1()
  {
  }
}

Now I am planning to remove all of these StarTeam log comments from the top of each file. But I don't want to remove any other comment from any file, including any copyright comment at the top. I only want to remove the chunk that starts with $Log and ends with $. I have looked at other questions related to this problem, but in my case the comment spans multiple lines. Would a regular expression be a good option for this?

Is there any utility I can use rather than writing my own code to remove this?

If a regular expression is the only quick solution, then I am stuck there.

Any help would be appreciated.

Answer by Flydog57:

If the format is exactly as you show, you could build a fragile little state machine (written here in C#) that looks like this.

Start with an enum to track the state:

enum ParseState
{
    Normal,
    MayBeInMultiLineComment,    //occurs after initial /*
    InMultilineComment,
}

and then add this code:

// Requires: using System; using System.Diagnostics; using System.Text;
public static void CommentStripper()
{
    var text = @"/*This is some developer comment at the top of the file*/
/*
 * $Log:
 *  1   Client Name 1.0   07/11/2012 16:28:54  Umair Khalid did something
 *  2   Client Name 1.0   07/11/2012 16:28:54  Umair Khalid again did 
 *                                             something
 * $
 */

/*
    This is not a log entry
*/

public class ABC
{
  /*This is just a variable*/
  int a = 0;
  public int method1()
  {
  }
}";

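    // This next line could be File.ReadAllLines to get the text from a file,
    // or you could read from a stream, line by line.
    // Split on both Windows and Unix line endings, because the literal above
    // keeps whatever line endings this source file was saved with.

    var lines = text.Split(new[] {"\r\n", "\n"}, StringSplitOptions.None);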

    var buffer = new StringBuilder();
    ParseState parseState = ParseState.Normal;
    string lastLine = string.Empty;

    foreach (var line in lines)
    {
        if (parseState == ParseState.Normal)
        {
            if (line == "/*")
            {
                lastLine = line;
                parseState = ParseState.MayBeInMultiLineComment;
            }
            else
            {
                buffer.AppendLine(line);
            }
        }
        else if (parseState == ParseState.MayBeInMultiLineComment)
        {
            if (line == " * $Log:")
            {
                parseState = ParseState.InMultilineComment;
            }
            else
            {
                parseState = ParseState.Normal;
                buffer.AppendLine(lastLine);
                buffer.AppendLine(line);
            }
            lastLine = string.Empty;
        }
        else if (parseState == ParseState.InMultilineComment)
        {
            if (line == " */")
            {
                parseState = ParseState.Normal;
            }
        }

    }
    // If the input ends right after a bare "/*", emit it so it is not lost.
    if (parseState == ParseState.MayBeInMultiLineComment)
    {
        buffer.AppendLine(lastLine);
    }
    //you could do what you want with the string, I'm just going to write it out to the debugger console.
    Debug.Write(buffer.ToString());
}

Note that lastLine is used because you need to read ahead one line to tell whether a comment is a log entry or not (which is what the MayBeInMultiLineComment state tracks).
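Since the files in the question are Java, here is a rough Java sketch of the same read-ahead idea. The class name LogCommentStripper and the strip method are made up for illustration. Unlike the C# version, it trims each line before comparing, which assumes you want to tolerate small indentation differences.

```java
import java.util.List;

// Hypothetical helper: removes block comments whose first line after "/*" starts with "* $Log:".
final class LogCommentStripper {
    private enum State { NORMAL, MAYBE_LOG_COMMENT, IN_LOG_COMMENT }

    static String strip(List<String> lines) {
        StringBuilder out = new StringBuilder();
        State state = State.NORMAL;
        String held = null; // the "/*" line held back until we see the next line

        for (String line : lines) {
            switch (state) {
                case NORMAL:
                    if (line.trim().equals("/*")) {
                        held = line;
                        state = State.MAYBE_LOG_COMMENT;
                    } else {
                        out.append(line).append('\n');
                    }
                    break;
                case MAYBE_LOG_COMMENT:
                    if (line.trim().startsWith("* $Log:")) {
                        // It is a log block: drop the held "/*" and this line.
                        state = State.IN_LOG_COMMENT;
                    } else {
                        // Ordinary comment: emit the held "/*" and the current line.
                        out.append(held).append('\n').append(line).append('\n');
                        state = State.NORMAL;
                    }
                    held = null;
                    break;
                case IN_LOG_COMMENT:
                    // Skip everything up to and including the closing "*/" line.
                    if (line.trim().equals("*/")) {
                        state = State.NORMAL;
                    }
                    break;
            }
        }
        // Input ended right after a bare "/*": emit it rather than lose it.
        if (state == State.MAYBE_LOG_COMMENT) {
            out.append(held).append('\n');
        }
        return out.toString();
    }
}
```

You could feed it the result of Files.readAllLines(path) and write the returned string back, ideally after trying it on a copy of the source tree first.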

The output from CommentStripper looks like:

/*This is some developer comment at the top of the file*/

/*
    This is not a log entry
*/

public class ABC
{
  /*This is just a variable*/
  int a = 0;
  public int method1()
  {
  }
}
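
On the regular expression part of the question: a regex can also do this, as long as every log block really starts with /* followed directly by a line beginning with * $Log:. Below is a minimal sketch in Java. The class LogCommentRegex and its pattern are my own assumptions based on the sample in the question, not anything StarTeam documents.

```java
import java.util.regex.Pattern;

// Hypothetical helper: strips StarTeam-style "$Log:" block comments with one regex.
final class LogCommentRegex {
    // "/*", optional whitespace (including the newline), "*", whitespace, "$Log:",
    // then reluctantly anything up to the first "*/", plus the rest of that line.
    // DOTALL lets '.' match across line breaks.
    private static final Pattern LOG_BLOCK = Pattern.compile(
        "/\\*\\s*\\*\\s*\\$Log:.*?\\*/[ \\t]*\\R?", Pattern.DOTALL);

    static String strip(String source) {
        return LOG_BLOCK.matcher(source).replaceAll("");
    }
}
```

Comments such as /*This is some developer comment*/ are left alone because the pattern requires a * and then $Log: right after the opening /*. It is just as fragile as the state machine: a log entry that itself contains */ would end the match early.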