K-Means Clustering a list of US addresses based on drive time

Question

K-Means Clustering a list of US addresses based on drive time

1.1k views Asked by elied0327 At 15 June 2015 at 20:53

I have 8 traveling consultants that need to visit 155 groups across the continental united states. Is there a way to find the optimal 8 regions based of drive time using k-means clustering? I see there are some methods implemented already for other data sets, but they are not based off drive time. How will I need to manipulate my data set to make it usable?

Thank you in advance for any feedback. I am by no means a great coder, I have taken only a few introductory courses back in college.

Original Q&A

There are 2 answers

knb On 16 June 2015 at 09:34

This looks more like an integer optimization problem. It has nothing to do with clustering.

Reminds me of the case study "Assigning Regions to Sales Representatives [SRs] at Pfizer Turkey" by Murat Köksalan and Sakine Batun, INFORMS Transactions on Education 9(2), p.70-71, January 2009. http://pubsonline.informs.org/doi/abs/10.1287/ited.1090.0021ca .

I had to solve a simplified version of the problem in a MOOC recently.

"Since the SRs have to visit the MDs in their offices, it is important to minimize the total distance traveled by the SRs. This is the objective function. Each SR has an office in a certain brick, called their "center brick". We will compute the total distance traveled by an SR as the sum of the distances between the center brick and every other brick in that SR's territory."

You can optimize this for certain criteria. I cannot give any more details here because it is quite complicated.

**andrew** · Accepted Answer · 2015-06-15T23:52:18+00:00

I think you are looking for "path planning" rather than clustering. The traveling salesman problem comes to mind

If you want to use clustering to find the individual regions you should find the coordinates for each location with respect to some global frame. One example would be using Latitude and longitude coordinates. Create an array X thats 155x2 where each row is a destination with the columns lat,long Then simply run matlab's kmeans something like

[idx,C] = kmeans(X,8);

should work nicely. This should be enough to get you started.

One issue with this approach is that it will group the sites by geographical location. Which isn't always the same as shortest travel time. For instance,

distance from (site A, site B) = 0.5 miles
distance from (site A, site C) = 2.0 miles

but getting from A-B requires going around a river and actual travel distance is 10 miles, whereas A-C is realistically 2.5 miles, clearly A-C is the better choice, but using global position alone wouldn't take this into account

TechQA.

K-Means Clustering a list of US addresses based on drive time

There are 2 answers

Related Questions in EXCEL

Related Questions in MATLAB

Related Questions in CLUSTER-ANALYSIS

Related Questions in K-MEANS

Related Questions in GEO

Popular Questions

Popular Tags

Trending Questions