Regex to get vCard base64 string (C#)

I'm having trouble writing a regex that extracts the base64 string, image type, and encoding of a vCard image. What would be the best way to obtain these, given the example input below?

BEGIN:VCARD
VERSION:2.1
N;LANGUAGE=en-us:User;Test
FN:Test User
ORG:NEI Global
TITLE:Network Administrator
NOTE;ENCODING=QUOTED-PRINTABLE:This is a test. =0D=0A=
=0D=0A=
Thanks!!!=0D=0A=
=0D=0A=
Test User
TEL;WORK;VOICE:(402) 201-3438
X-MS-OL-DEFAULT-POSTAL-ADDRESS:0
EMAIL;PREF;INTERNET:[email protected]
X-MS-CARDPICTURE;TYPE=JPEG;ENCODING=BASE64:
 /9j/4AAQSkZJRgABAQEAYABgAAD/2wBDAAcFBQYFBAcGBQYIBwcIChELCgkJChUPEAwRGBUa
 GRgVGBcbHichGx0lHRcYIi4iJSgpKywrGiAvMy8qMicqKyr/2wBDAQcICAoJChQLCxQqHBgc
 KioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKir/wAAR
 CACUACcDASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAA
 AgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkK
 FhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWG
 h4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl
 5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREA
 AgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYk
 NOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOE
 hYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk
 5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD393C4B6mgHqSeKhc7nOPoKRj2HQUAPef+
 5z7mlj5GT19agAyasD5U59M0AQyfPNj0opIhliaKAHDjJ9BSUp+7SUAEYy1SzHEZ9+KbCOaS
 Y8gUANU7Yie+aKJOFVaKAFbrxSUdqB1oAnjHy5qA/NN+NWD8kf0FV4uWJ9qAElOXxRTWbJJ9
 6KAH9qdGPmFN7VLEP5YoAJmwgHrUa/LGT60sxy+PQUEEqFXk96AIKKmEar9880UAA5OKlQ4X
 jknmkDIei8/SlclUOOKAG+WN2X5J7CmvIRwo28UkfLEnrimuQXNADVG5vmNFJjJwKKALEY+Y
 fnSzHoPxpY1+U/lTHZdx4yenPQUAEYIXPvUeFXqcn0FSOf3fPHAHFRUABc9F+X6UUyigC8eE
 57VXA3MPc1PKcRn3qFOpPoKACQ9PxNR06T72PTim0AMooooAszH5gPSkj+7z3NI5yx+tOPCf
 QfzoAhJyc0UHrRQAyiiigCyH3EAilYrjnuaYnUn0FI/UD0FABsU9P0NIY/f8xTT1oDEdCRQA
 bG7c/Q0Ubz3AP4UUASp91vwprffb60UUAMPWiiigBlFFFAH/2Q==

X-MS-OL-DESIGN;CHARSET=utf-8:<card xmlns="http://schemas.microsoft.com/office/outlook/12/electronicbusinesscards" ver="1.0" layout="left" bgcolor="ffffff"><img xmlns="" align="fit" area="16" use="cardpicture"/><fld xmlns="" prop="name" align="left" dir="ltr" style="b" color="8000ff" size="13"/><fld xmlns="" prop="org" align="left" dir="ltr" color="000000" size="8"/><fld xmlns="" prop="title" align="left" dir="ltr" color="000000" size="8"/><fld xmlns="" prop="blank" size="8"/><fld xmlns="" prop="telwork" align="left" dir="ltr" color="d48d2a" size="8"><label align="right" color="626262">Work</label></fld><fld xmlns="" prop="email" align="left" dir="ltr" color="d48d2a" size="8"/><fld xmlns="" prop="blank" size="8"/><fld xmlns="" prop="blank" size="8"/><fld xmlns="" prop="blank" size="8"/><fld xmlns="" prop="blank" size="8"/><fld xmlns="" prop="blank" size="8"/><fld xmlns="" prop="blank" size="8"/><fld xmlns="" prop="blank" size="8"/><fld xmlns="" prop="blank" size="8"/><fld xmlns="" prop="blank" size="8"/><fld xmlns="" prop="blank" size="8"/></card>
REV:20150609T021123Z
END:VCARD

The example below works for the NOTE property, but for the image property it only captures data that sits on the same line as the property name. How can it be modified to capture the entire base64 string, which is folded across the following lines?

public void ParseLines(string s)
{
    RegexOptions options = RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.IgnorePatternWhitespace;

    Regex regex;
    Match m;
    MatchCollection mc;


    ///Note
    regex = new Regex(@"((?<strElement>(NOTE)) (;*(?<strAttr>(ENCODING=QUOTED-PRINTABLE)))*  ([^:]*)*  (:(?<strValue> (([^\n\r]*=[\n\r]+)*[^\n\r]*[^=][\n\r]*) )))", options);
    m = regex.Match(s);
    if (m.Success)
    {
        Note = m.Groups["strValue"].Value;
        //Remove connections and escape strings. The order is significant.
        Note = Note.Replace("=" + Environment.NewLine, "");
        Note = Note.Replace("=0D=0A" , Environment.NewLine);
        Note = Note.Replace("=3D", "=");
    }

    ///Image (NOT WORKING YET)
    regex = new Regex(@"(\n(?<strElement>(X-MS-CARDPICTURE)) (;*(?<strType>(TYPE=JPEG|TYPE=PNG)))*  (;(?<strAttr>(ENCODING=BASE64)))* (;[^:]*)*  (:(?<strValue>\r\n)))", options);
    mc = regex.Matches(s);
    if (mc.Count > 0)
    {
        for (int i = 0; i < mc.Count; i++)
        {
            m = mc[i];
            CP.encoding = m.Groups["strAttr"].Value; //WORKS
            CP.image = m.Groups["strValue"].Value;  // ONLY GETS THINGS ON THE SAME LINE
            CP.type = m.Groups["strType"].Value;  //WORKS


        }
    }


}
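
One possible direction, offered as a minimal sketch rather than a definitive fix: let strValue consume the rest of the property line plus every following line that starts with a space or tab, which is how the base64 data is folded in the sample above. The match then stops at the first line that is empty or starts in column one. CardPictureParser and TryParse are hypothetical names that are not part of the code above, and the sketch assumes the TYPE= and ENCODING= parameters are written out in full as in the sample.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original code.
public static class CardPictureParser
{
    // Whitespace in the pattern is ignored (IgnorePatternWhitespace),
    // except inside character classes such as [ \t].
    private static readonly Regex PictureRegex = new Regex(
        @"^X-MS-CARDPICTURE
          (?: ;TYPE=(?<strType>[^;:\r\n]+)
            | ;ENCODING=(?<strAttr>[^;:\r\n]+)
            | ;[^;:\r\n]+
          )*
          :
          (?<strValue> [^\r\n]* (?: \r?\n [ \t]+ [^\r\n]* )* )",
        RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.IgnorePatternWhitespace);

    public static bool TryParse(string vcard, out string type, out string encoding, out byte[] image)
    {
        type = "";
        encoding = "";
        image = new byte[0];

        Match m = PictureRegex.Match(vcard);
        if (!m.Success) return false;

        type = m.Groups["strType"].Value;      // e.g. JPEG
        encoding = m.Groups["strAttr"].Value;  // e.g. BASE64

        // Drop the line breaks and the leading whitespace of each folded line.
        string base64 = Regex.Replace(m.Groups["strValue"].Value, @"\s+", "");
        try
        {
            image = Convert.FromBase64String(base64);
        }
        catch (FormatException)
        {
            return false; // the captured text was not valid base64
        }
        return true;
    }
}
```

Inside ParseLines this could be called as CardPictureParser.TryParse(s, out type, out encoding, out bytes), with the results copied into CP. Whether CP.image should hold the decoded bytes or the raw base64 text is not clear from the code above, so that part is left open.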