I'm new on trying to optimize with cython and I need some help because I know I can optimize more but I don't know how.
Firstly what I did was declare the variables in C, however a lot of them are objects, and I don't know how to declare that.
Then I have the class, in the HTML I can see every time the code wants to get an object of the class it is slow, so I wanna optimize that too, and I don't know how to do it :(
import math
import random
def class City:
def __init__(self, city_name='NoName', posx=0, posy=0):
self.name = city_name
self.x = posx
self.y = posy
def distance_between_cities( object C1, object C2 ):
"""Compute distance between two cities"""
dx = C1.x - C2.x
dy = C1.y - C2.y
return math.sqrt( dx*dx + dy*dy )
def read_cities_from_file ( str filename ):
"""Read list of Cities from text file"""
cdef list Cities= [] # List of Empty Cities
cdef list R = []
with open(filename) as file:
for line in file:
R= line.split()
Cities.append ( City( R[0], float(R[1]), float(R[2]) ) )
return Cities
def path_distance( object Cities ):
"""Compute total distance of the path traversing all cities in order"""
cdef int i = 0
cdef double D = 0
for i in range( len(Cities) ):
D = D + distance_between_cities ( Cities[i],
Cities[i+1 if i+1<len(Cities) else 0])
return D
cdef int closerCity( object City, object Cities ):
"""Compute position of the city C in the list of Cities that is closer to City"""
cdef float minDist = 2*1000*1000
cdef int minIndex= 0
cdef int i = 0
cdef double dx = 0
cdef double dy = 0
cdef double dist = 0
for i in range(len(Cities)):
c = Cities[i]
dx = City.x - c.x
dy = City.y - c.y
dist = dx*dx + dy*dy
if dist < minDist:
minDist = dist
minIndex= i
return minIndex
def GoodPath( object Cities ):
"""Generate a path with small total distance using greedy algorithm"""
cdef list NotVisited = []
cdef int len_C = len(Cities)
cdef int i = 0
for i in range(len_C):
NotVisited.append(Cities[i])
Path= [ Cities[0] ] # Start path with first city
del NotVisited[0]
while len(NotVisited) > 0:
pos = closerCity( Path[-1], NotVisited )
Path.append( NotVisited[pos] )
del NotVisited[pos]
return Path
if __name__ == "__main__":
import argparse as arg
parser = arg.ArgumentParser(prog='ARGUMENTS', usage='%(prog)s [options]')
parser.add_argument("input", type=str, help="File containing list of Cities")
args = parser.parse_args()
ListOfCities= read_cities_from_file ( str(args.input) )
GoodList = GoodPath( ListOfCities )
print ( "Initial Distance =", path_distance( ListOfCities ),
" Good Distance =", path_distance( GoodList) )
This is what I get from the HTML when I cythonize: Part 1 of the HTML Part 2
Any help will be appreciated
Thank You!
The complexity of your algorithm is
O(n^2)
(withn
the number of cities). Indeed, you walk through all the cities and for each city you iterate over most of the cities again (O(n)
cities). Note also that item deletion is algo performed inO(n)
time. The result can be computed with a more efficientO(n log n)
algorithm:You can build a K-d tree structure, or more specifically a Quad-tree structure in
O(n log n)
time and then find the closest city of any other city inO(log n)
time with this structure. Additionally, it is better to change the type ofNotVisited
to aset
rather than alist
to avoid a slow list item deletion.Apart from that, I advise you to replace
object
-typed variables with lower-level types (eg. arrays or tuples of double/int) so that Cython can generate a faster C code.Note that your method will often not find the overall shortest path and may not even find a quite small path. There are better algorithms, but they are more expensive.