GREP - finding all occurrences of a string

Asked by AudioBubble on 23 November 2009 at 20:37

I am tasked with white labeling an application so that it contains no references to our company, website, etc. The problem I am running into is that I have many different patterns to look for and would like to guarantee that all of them are removed. Since the application was not developed entirely in-house, we cannot simply look for occurrences in messages.properties and be done. We must go through JSPs, Java code, and XML.

I am using grep to filter results like this:

grep -ir SOME_PATTERN . | grep -v import | grep -v '//' | grep -v '/\*' ...

The patterns are escaped when I'm using them on the command line; however, I don't feel this pattern matching is very robust. There could possibly be occurrences that have import in them (unlikely) or even /* (the beginning of a javadoc comment).

All of the text output to the screen must come from a string declaration somewhere or a constants file. So, I can assume I will find something like:

public static final String SOME_CONSTANT = "SOME_PATTERN is currently unavailable";

I would like to find that occurrence as well as:

public static final String SOME_CONSTANT = "
SOME_PATTERN blah blah blah";

Alternatively, if we had an internal crawler / automated tests, I could simply pull back the xhtml from each page and check the source to ensure it was clean.

There are 2 answers

grossvogel On 23 November 2009 at 21:07

To address your concern about missing some occurrences, why not filter progressively:

1. Create a text file with all possible matches as a starting point.
2. Use filter X (grep for '^import', for example) to dump probable false positives into a tmp file.
3. Use filter X again to remove those matches from your working file (a copy of [1]).
4. Do a quick visual pass of the tmp file and add any real matches back in.
5. Repeat [2]-[4] with other filters.

This might take some time, of course, but it doesn't sound like this is something you want to get wrong...
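
The steps above could be sketched in Python roughly as follows. This is only an illustration, assuming the initial grep output has already been parsed into (path, line number, text) tuples; the function names and the example filters are hypothetical, not part of any existing tool.

```python
import re


def split_matches(matches, filter_regex):
    """Partition matches into (kept, set_aside) using a single filter.

    Each match is a (path, line_number, text) tuple. The filter is applied
    to the text with leading whitespace removed, so '^import' behaves as it
    would on the source line itself.
    """
    pattern = re.compile(filter_regex)
    kept, set_aside = [], []
    for match in matches:
        text = match[2].lstrip()
        if pattern.search(text):
            set_aside.append(match)  # probable false positive, review by hand
        else:
            kept.append(match)
    return kept, set_aside


def progressive_filter(matches, filter_regexes):
    """Apply each filter in turn; nothing is discarded, only set aside."""
    working = list(matches)
    review = {}
    for filter_regex in filter_regexes:
        working, review[filter_regex] = split_matches(working, filter_regex)
    return working, review
```

Each entry in `review` plays the role of the tmp file in step 2: it holds what one filter removed, so it can be checked by eye and any real matches added back to the working list.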

psihodelia (accepted answer) On 23 November 2009 at 20:48

I would use sed, not grep! Sed performs basic text transformations on an input stream. Try its s/regexp/replacement/ command.

You can also try awk. Its -F option sets the field separator, so you can pass ; to split each line of your files on semicolons.

The best solution, however, would be a simple script in Perl or Python.
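
As a rough idea of what such a script might look like, here is a minimal Python sketch. It assumes the patterns are plain phrases rather than regular expressions, matches them case-insensitively, and lets any run of whitespace (including a line break) stand between the words of a phrase, so a phrase split across two lines of a string literal is still reported. The function names and the list of file suffixes are assumptions chosen for illustration.

```python
import re
from pathlib import Path


def compile_pattern(phrase):
    """Build a case-insensitive regex in which the words of the phrase may
    be separated by any whitespace, including newlines."""
    words = phrase.split()
    return re.compile(r"\s+".join(re.escape(w) for w in words), re.IGNORECASE)


def find_occurrences(text, phrases):
    """Return sorted (line_number, phrase) pairs for every match in text."""
    hits = []
    for phrase in phrases:
        for m in compile_pattern(phrase).finditer(text):
            # Line number of the first character of the match (1-based).
            line_number = text.count("\n", 0, m.start()) + 1
            hits.append((line_number, phrase))
    return sorted(hits)


def scan_tree(root, phrases, suffixes=(".java", ".jsp", ".xml", ".properties")):
    """Yield (path, line_number, phrase) for every match under root."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="replace")
            for line_number, phrase in find_occurrences(text, phrases):
                yield str(path), line_number, phrase
```

Calling `scan_tree(".", ["SOME_PATTERN"])` from the project root would list every file and line where the phrase begins. Unlike the chain of `grep -v` filters, this reports everything, including imports and comments, so deciding which hits are false positives remains a separate step.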
