Error while parsing a restructured text docstring into HTML


So, I am writing a simple application using the Python sklearn library. I need to parse the docstrings of the sklearn estimator models. I am not familiar with reStructuredText, but some quick research on the sklearn "Contributing documentation" page suggests that these docstrings are written in reStructuredText format. Following this question, I tried the following (using the support vector classifier SVC as an example):

from sklearn.svm import SVC
from docutils.core import publish_string
print(publish_string(SVC.__doc__, writer_name='html'))

For anyone who needs it, the raw docstring is:

"C-Support Vector Classification.\n\n    The implementations is a based on libsvm. The fit time complexity\n    is more than quadratic with the number of samples which makes it hard\n    to scale to dataset with more than a couple of 10000 samples.\n\n    The multiclass support is handled according to a one-vs-one scheme.\n\n    For details on the precise mathematical formulation of the provided\n    kernel functions and how `gamma`, `coef0` and `degree` affect each,\n    see the corresponding section in the narrative documentation:\n    :ref:`svm_kernels`.\n\n    .. The narrative documentation is available at http://scikit-learn.org/\n\n    Parameters\n    ----------\n    C : float, optional (default=1.0)\n        Penalty parameter C of the error term.\n\n    kernel : string, optional (default='rbf')\n         Specifies the kernel type to be used in the algorithm.\n         It must be one of 'linear', 'poly', 'rbf', 'sigmoid', 'precomputed' or\n         a callable.\n         If none is given, 'rbf' will be used. If a callable is given it is\n         used to precompute the kernel matrix.\n\n    degree : int, optional (default=3)\n        Degree of the polynomial kernel function ('poly').\n        Ignored by all other kernels.\n\n    gamma : float, optional (default=0.0)\n        Kernel coefficient for 'rbf', 'poly' and 'sigmoid'.\n        If gamma is 0.0 then 1/n_features will be used instead.\n\n    coef0 : float, optional (default=0.0)\n        Independent term in kernel function.\n        It is only significant in 'poly' and 'sigmoid'.\n\n    probability: boolean, optional (default=False)\n        Whether to enable probability estimates. 
This must be enabled prior\n        to calling `fit`, and will slow down that method.\n\n    shrinking: boolean, optional (default=True)\n        Whether to use the shrinking heuristic.\n\n    tol : float, optional (default=1e-3)\n        Tolerance for stopping criterion.\n\n    cache_size : float, optional\n        Specify the size of the kernel cache (in MB)\n\n    class_weight : {dict, 'auto'}, optional\n        Set the parameter C of class i to class_weight[i]*C for\n        SVC. If not given, all classes are supposed to have\n        weight one. The 'auto' mode uses the values of y to\n        automatically adjust weights inversely proportional to\n        class frequencies.\n\n    verbose : bool, default: False\n        Enable verbose output. Note that this setting takes advantage of a\n        per-process runtime setting in libsvm that, if enabled, may not work\n        properly in a multithreaded context.\n\n    max_iter : int, optional (default=-1)\n        Hard limit on iterations within solver, or -1 for no limit.\n\n    random_state : int seed, RandomState instance, or None (default)\n        The seed of the pseudo random number generator to use when\n        shuffling the data for probability estimation.\n\n    Attributes\n    ----------\n    `support_` : array-like, shape = [n_SV]\n        Index of support vectors.\n\n    `support_vectors_` : array-like, shape = [n_SV, n_features]\n        Support vectors.\n\n    `n_support_` : array-like, dtype=int32, shape = [n_class]\n        number of support vector for each class.\n\n    `dual_coef_` : array, shape = [n_class-1, n_SV]\n        Coefficients of the support vector in the decision function.         For multiclass, coefficient for all 1-vs-1 classifiers.         The layout of the coefficients in the multiclass case is somewhat         non-trivial. 
See the section about multi-class classification in the         SVM section of the User Guide for details.\n\n    `coef_` : array, shape = [n_class-1, n_features]\n        Weights asigned to the features (coefficients in the primal\n        problem). This is only available in the case of linear kernel.\n\n        `coef_` is readonly property derived from `dual_coef_` and\n        `support_vectors_`\n\n    `intercept_` : array, shape = [n_class * (n_class-1) / 2]\n        Constants in decision function.\n\n    Examples\n    --------\n    >>> import numpy as np\n    >>> X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])\n    >>> y = np.array([1, 1, 2, 2])\n    >>> from sklearn.svm import SVC\n    >>> clf = SVC()\n    >>> clf.fit(X, y) #doctest: +NORMALIZE_WHITESPACE\n    SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3,\n        gamma=0.0, kernel='rbf', max_iter=-1, probability=False,\n        random_state=None, shrinking=True, tol=0.001, verbose=False)\n    >>> print(clf.predict([[-0.8, -1]]))\n    [1]\n\n    See also\n    --------\n    SVR\n        Support Vector Machine for Regression implemented using libsvm.\n\n    LinearSVC\n        Scalable Linear Support Vector Machine for classification\n        implemented using liblinear. Check the See also section of\n        LinearSVC for more comparison element.\n\n    "

However, I get a parser error:

<string>:9: (ERROR/3) Unknown interpreted text role "ref".
<string>:17: (SEVERE/4) Unexpected section title.

Parameters
----------
Traceback (most recent call last):

  File "<ipython-input-22-2ceadc2dc730>", line 1, in <module>
    publish_string(SVC.__doc__)

  File "C:\Anaconda3\lib\site-packages\docutils\core.py", line 414, in publish_string
    enable_exit_status=enable_exit_status)

  File "C:\Anaconda3\lib\site-packages\docutils\core.py", line 662, in publish_programmatically
    output = pub.publish(enable_exit_status=enable_exit_status)

  File "C:\Anaconda3\lib\site-packages\docutils\core.py", line 217, in publish
    self.settings)

  File "C:\Anaconda3\lib\site-packages\docutils\readers\__init__.py", line 72, in read
    self.parse()

  File "C:\Anaconda3\lib\site-packages\docutils\readers\__init__.py", line 78, in parse
    self.parser.parse(self.input, document)

  File "C:\Anaconda3\lib\site-packages\docutils\parsers\rst\__init__.py", line 172, in parse
    self.statemachine.run(inputlines, document, inliner=self.inliner)

  File "C:\Anaconda3\lib\site-packages\docutils\parsers\rst\states.py", line 170, in run
    input_source=document['source'])

  File "C:\Anaconda3\lib\site-packages\docutils\statemachine.py", line 239, in run
    context, state, transitions)

  File "C:\Anaconda3\lib\site-packages\docutils\statemachine.py", line 460, in check_line
    return method(match, context, next_state)

  File "C:\Anaconda3\lib\site-packages\docutils\parsers\rst\states.py", line 1135, in indent
    elements = self.block_quote(indented, line_offset)

  File "C:\Anaconda3\lib\site-packages\docutils\parsers\rst\states.py", line 1150, in block_quote
    self.nested_parse(blockquote_lines, line_offset, blockquote)

  File "C:\Anaconda3\lib\site-packages\docutils\parsers\rst\states.py", line 282, in nested_parse
    node=node, match_titles=match_titles)

  File "C:\Anaconda3\lib\site-packages\docutils\parsers\rst\states.py", line 195, in run
    results = StateMachineWS.run(self, input_lines, input_offset)

  File "C:\Anaconda3\lib\site-packages\docutils\statemachine.py", line 239, in run
    context, state, transitions)

  File "C:\Anaconda3\lib\site-packages\docutils\statemachine.py", line 460, in check_line
    return method(match, context, next_state)

  File "C:\Anaconda3\lib\site-packages\docutils\parsers\rst\states.py", line 2720, in underline
    source=src, line=srcline)

  File "C:\Anaconda3\lib\site-packages\docutils\utils\__init__.py", line 235, in severe
    return self.system_message(self.SEVERE_LEVEL, *args, **kwargs)

  File "C:\Anaconda3\lib\site-packages\docutils\utils\__init__.py", line 193, in system_message
    raise SystemMessage(msg, level)

SystemMessage: <string>:17: (SEVERE/4) Unexpected section title.

Parameters
----------

All I really want is a way to convert the docstrings of sklearn objects into HTML without having to write a full-fledged parser myself. If there is no way to do so, any suggestions for writing such a parser are welcome. Thanks in advance.
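
A note on the traceback above: the SEVERE error is raised from `block_quote` via `nested_parse`, which suggests docutils is treating the indented body of the docstring as a block quote, where section titles are not allowed. A minimal sketch of one way around that, assuming the leading indentation is the only problem, is to dedent the docstring with the standard library's `inspect.cleandoc` before handing it to docutils. The short `raw` string below is a hypothetical stand-in with the same layout as `SVC.__doc__`, not the real docstring, and the docutils call is skipped if docutils is not installed.

```python
import inspect

# Hypothetical stand-in with the same layout as SVC.__doc__: the first line
# is flush left and every later line carries four spaces of indentation.
raw = (
    "C-Support Vector Classification.\n\n"
    "    Parameters\n"
    "    ----------\n"
    "    C : float, optional (default=1.0)\n"
    "        Penalty parameter C of the error term.\n"
)

# cleandoc strips the indentation shared by all lines after the first,
# so "Parameters" and its underline start in column 0.
clean = inspect.cleandoc(raw)
print(clean)

# docutils is third-party; only attempt the conversion if it is available.
try:
    from docutils.core import publish_string
except ImportError:
    publish_string = None

if publish_string is not None:
    html = publish_string(clean, writer_name="html")
    print(html[:200])
```

For a real estimator, `inspect.getdoc(SVC)` returns the docstring already cleaned in this way. The `:ref:` role is defined by Sphinx rather than docutils, so the ERROR/3 message about it is likely to remain; it is reported but does not halt parsing at the default settings. Since sklearn docstrings follow the numpydoc conventions, plain docutils output may also differ from the rendering on the sklearn website.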

There are 0 answers