I want to %timeit a function in Jupyter.
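Generate Data
import gc
import pandas as pd

df = pd.DataFrame()  # assumed: the post does not show how df was created
df["One"] = range(1, 1001)
df["Two"] = range(2000, 3000)
df["Three"] = range(3000, 4000)
df.set_index(["One"], drop=True, inplace=True)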
Set up function
def test_iterrows(df):
    # Replace each value in "Three" with the string "Even" or "Odd", in place.
    for index, row in df.iterrows():
        if (row["Three"] & 1 == 0):
            df.loc[index, "Three"] = "Even"
        else:
            df.loc[index, "Three"] = "Odd"
    print(df.head())
    gc.collect()
    return None
When I run test_iterrows(df), I get:
      Two Three
One
1    2000  Even
2    2001   Odd
3    2002  Even
4    2003   Odd
5    2004  Even
Fine. The function works. However, when I do %timeit test_iterrows(df), I get an error:
<ipython-input-29-326f4a0f49ee> in test_iterrows(df)
     13 def test_iterrows(df):
     14     for index, row in df.iterrows():
---> 15         if (row["Three"] & 1 == 0):
     16             df.loc[index, "Three"] = "Even"
     17         else:

TypeError: unsupported operand type(s) for &: 'str' and 'int'
What is going on here? My (probably wrong) interpretation is that I apparently can't %timeit functions that contain %.
What is going on here?
%timeit repeatedly executes the statement, and the function changes the df in-place. After the first run, the "Three" column already holds the strings "Even" and "Odd", so the next run evaluates str & int and fails. Note that I get the same exception when I just call the function twice.

You probably should pass in a copy, although that would slightly "bias" the timings because it also times the time it takes to copy it.

Also, I'm not quite sure what the gc.collect() call is supposed to do there, because gc.collect just garbage collects objects that can't be collected by normal means because of reference cycles.
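To make the in-place mutation and the copy workaround concrete, here is a minimal sketch rather than the exact code from the post. The helpers `make_df` and `label_parity` are hypothetical names, the frame is smaller than in the question, the plain `timeit` module stands in for the `%timeit` magic so the snippet runs outside Jupyter, and the "Three" column is cast to `object` up front on the assumption that this keeps the string assignment working across pandas versions.

```python
import timeit

import pandas as pd


def make_df(n=100):
    # Hypothetical helper: same layout as the question's data, fewer rows.
    df = pd.DataFrame({
        "One": list(range(1, n + 1)),
        "Two": list(range(2000, 2000 + n)),
        "Three": list(range(3000, 3000 + n)),
    }).set_index("One")
    # Assumption: an object column accepts strings without dtype complaints.
    df["Three"] = df["Three"].astype(object)
    return df


def label_parity(df):
    # Stripped-down test_iterrows: no print, no gc.collect().
    for index, row in df.iterrows():
        if (row["Three"] & 1) == 0:
            df.loc[index, "Three"] = "Even"
        else:
            df.loc[index, "Three"] = "Odd"


df = make_df()
label_parity(df)            # first call: integers become "Even"/"Odd"
try:
    label_parity(df)        # second call: the column now holds strings
    second_call_error = None
except TypeError as exc:
    second_call_error = exc
print(repr(second_call_error))

# Timing a fresh copy on every run avoids the error; timing the copy alone
# gives a rough idea of how much it contributes to the measurement.
fresh = make_df()
with_copy = timeit.timeit(lambda: label_parity(fresh.copy()), number=3)
copy_only = timeit.timeit(lambda: fresh.copy(), number=3)
print(with_copy, copy_only)
```

In a notebook the equivalent would be `%timeit test_iterrows(df.copy())`. Comparing `with_copy` against `copy_only` shows roughly how large the bias from copying is.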
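On the gc.collect() remark, a small sketch of what the cyclic collector is for, assuming CPython's reference-counting behaviour. The `Node` class is a hypothetical helper used only for this illustration, and automatic collection is disabled for the duration so the effect of the explicit call is visible.

```python
import gc
import weakref


class Node:
    """Hypothetical helper class used only for this illustration."""

    def __init__(self):
        self.other = None


gc.disable()                   # keep the automatic cyclic collector out of the way
a, b = Node(), Node()
a.other, b.other = b, a        # reference cycle: a -> b -> a
probe = weakref.ref(a)         # lets us check whether the object still exists
del a, b                       # the cycle keeps both reference counts above zero
alive_before = probe() is not None
gc.collect()                   # the cyclic collector finds and frees the pair
alive_after = probe() is not None
gc.enable()
print(alive_before, alive_after)
```

Objects without cycles, such as the temporary rows produced inside the loop, are normally freed by reference counting alone, so the explicit call in the timed function mostly adds overhead to the measurement.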