I wrote a simple text STL (Standard Tessellation Library) parser using Attoparsec. STL contains a collection of facets. Each facet contains a normal, and vertices of triangle. Typical STL file can be large (~100 Mb or more).
STL file format is
solid
facet normal nx ny nz
outer loop
vertex x1 y1 z1
vertex x2 y2 z2
vertex x3 y3 z3
endfacet
endsolid
The parser code is here. Full implementation can be found at STLReader
-- | Point represents a vertex of the triangle
data Point a = Point !a !a !a deriving Show
-- | Vector for a given class
data Vector a = Vector !a !a !a deriving Show
-- | Parse coordinate triplet.
coordinates :: (Fractional a) => Text -> (a -> a -> a -> b) -> Parser b
coordinates s f = do
skipSpace
string s
!x <- coordinate
!y <- coordinate
!z <- coordinate
return $! f x y z
where
coordinate = skipWhile isHorizontalSpace *> fmap realToFrac double
{-# INLINE coordinates #-}
type RawFacet a = (Vector a, Point a, Point a, Point a)
-- | Parse a facet. The facet comprises of a normal, and three vertices
facet :: Fractional a => Parser (RawFacet a)
facet = (,,,) <$> beginFacet
<* (skipSpace *> "outer loop")
<*> vertexPoint
<*> vertexPoint
<*> vertexPoint
<* (skipSpace <* "endloop" <* endFacet )
<?> "facet"
where
beginFacet = skipSpace <* "facet" *> coordinates "normal" Vector
endFacet = skipSpace <* string "endfacet"
vertexPoint = coordinates "vertex" Point
{-# INLINE facet #-}
rawFacets :: Fractional a => Parser [RawFacet a]
rawFacets = beginSolid *> many' facet <* endSolid
where
solidName = option "default" (skipWhile isHorizontalSpace *> fmap T.pack (many1 $ satisfy isAlphaNum) )
beginSolid = skipSpace <* "solid" *> solidName <?> "start solid"
endSolid = skipSpace <* "endsolid" <?> "end solid"
-- | Read text STL file. STL extensions for color etc. are not supported in this version.
readTextSTL :: Fractional a => FilePath -> IO (Either String [RawFacet a])
readTextSTL path = liftM (Al.eitherResult . Al.parse rawFacets) (TIO.readFile path)
main :: IO Int
main = do
(path:_) <- getArgs
putStrLn $ "Parsing STL file: " ++ path
s <- readTextSTL path
putStrLn "Parsing complete"
case s of
Left error -> putStrLn error
Right s -> putStrLn $ "Num facets : " ++ show (length s)
return 0
I benchmarked this code with 'c' parser supplied with Meshlab. When I tested with 69Mb scan, meshlab completed the job in roughly 13 s, whereas it took 22 s with attoparsec. So though attoparsec enabled me to parse faster than Parsec (with parsec it was about 36 s), I still have a long way to go.
How can I improve this parser further?