Defining variables: problems with program logical flow

97 views Asked by At

I'm currently trying to implement the following algorithm for finding max.-margin decision boundaries (for use in an SVM):

let D = {(x1, y1), (x2, y2), . . . , (xl, yl )} ⊂ Rn ×{+1,−1}
r ← max{|x| | (x, y) ∈ D}
q ← 1000
let w∗ and b∗ be undefined
Construct X according to (6.32) using D.
for each b ∈ [−q, q] do
    Construct c according to (6.33) using b.
    w ← solve(I, 0, X, c)
    if (w is defined and w∗ is undefined) or (w is defined and |w| < |w∗|) then
            w∗ ← w
            b∗ ← b
    end if
end for
if w∗ is undefined then
    stop constraints not satisfiable
else if |w|∗ > q/r then
    stop bounding assumption of |w| violated
end if
return (w∗, b∗)

The only variable we're concerned about is w*, which is supposed to be a matrix (from the package CVXOPT, not a numpy array -- although CVXOPT matrices can be created from numpy arrays). The main problem happens when I check whether w* is defined.

If I initialize it to None on line 4, then it works fine for the "w* is undefined" comparison in line 9, but the second comparison "|w| < |w*|", which compares the norm of w and w*, fails since you can't just run np.linalg.norm() on a None variable.

If I initialize w* to some junk matrix (say [40,40]) and remember that as my placeholder value for undefined, substituting in that junk matrix down the line where I need to check if w* is undefined, then I also run into problems -- CVXOPT won't allow for direct comparison of matrices to see if they're equal or not. I could cycle through and compare each element, but that would hurt performance significantly (esp. since I may use this for large datasets).

Here's my a Gist with my faulty code: http://tinyurl.com/pr5j44w

Note that wopt is the problematic variable w*.

1

There are 1 answers

2
jonrsharpe On BEST ANSWER

Change your test; rather than var_defined(wopt), use wopt is not None. Python's lazy and/or evaluation will then mean that the np.linalg is only applied if wopt has been defined.

Your current var_defined function is useless; if a name actually hasn't been assigned, you will get a NameError when you try to call the function on that name. Within the function, if it does get called without error, var is always defined (as it is the argument), so it will always return 1. You aren't actually checking whether the argument is None.

Also, functions like that should return True or False rather than 0 and 1, and can be tested directly:

if not bool_func(x): # no == 0