I'm doing research in the field of emotion recognition. For this purpose I need to catch and classify particular face details like eyes, nose, mouth, etc. Standard OpenCV function for this is detectMultiScale(), but its disadvantage is that it returns list of rectangles (video) while I'm mostly interested in particular key points - corners of mouth, upper and lower points, edges, etc (video). 
So, how do they do it? OpenCV is ideal, but other solutions are ok too.
 
                        
To analyse such precise points, you can use Active appearance models. Your second video seems to be done with AAM. Check out above wikipedia link, where you can get a lot of AAM tools and API.
On the other hand, if you can detect mouth using haar-cascade, apply colour filtering. Obviously lips and surrounding region has color difference. You get precise model of lips and find its edges.
Check out this paper: Lip Contour Extraction