Best way to cross matching large datasets


Posted on 16th Feb 2014 07:03 pm by admin

Hi,

Im running a script where am I cross matching about 200 000 data sets with each other. Each data set consists of 8 parameters and I want to count all datasets which have similar or the same parameters for each data set.

Right now, I am doing the matching via a MySql query which im calling about 200 000 times. The problem is that using a query is extremely expensive… it takes up to 2 hours until the script is done. So I am wondering if there is a better method to cross match data sets and if some of could help me find a better solution.

While researching I found out that arrays may be a faster alternative to queries. And so far, I identified 3 possible ways for cross matching:

1. nested foreach () loops
foreach($array as ar1)
foreach($array as ar2)
if ($ar1[0] == $ar1[0])….

2. Using an Array_map with Callback function, so that i would have only one "hand coded" loop
foreach($array as arr)
if ($arr[0] == $parameter)….

3. Array walk where i could save one "hand coded" loop as well.

Theoretically would be the best/fastest way to go about it? Can Anyone tell me what technically the difference between those 3 ways is? And which one is the better approach or if there other alternatives to them?

I am thankful for any advice that helps me reduce execution time!

No comments posted yet

Your Answer:

Login to answer
343 Like 48 Dislike
Previous forums Next forums
Other forums

Reading Most Recent CSV File in Directory
I thought I had wrapped this project up, but found out that the program I use to FTP a csv file to m

Join Query Help
Hi all,

I am having problems with the below code, which we shall call 'my first join query'!

problem with php mysql query
Hi guy's...

I'm totally lost here..because don't have any idea how to make a query for grab r

remove innitial

and

tags
i am using tiny_mce as a text editor for my CMS.
buy now the problem is it add <p>

Help! refer to a friend script with captcha code
Hi guys, I am posting on here in desperate need for some help with an ongoing search I have been doi

simple ping code
been searchin the site/web and found code thats simple but doesnt work.

I have a personal we

Does deleting the spmlog directory critical?
Hi Everyone,

Please, hope you could help me. We're having problems with the SAP backup. I

Async WSAConnect failed on XP with error code = 2 ("File not found")
Hi all,

I have very strange bug, please help me if you can.

It is reproduced o

Accessing element of object array
Hello

My object looks like this:


Array ( [0] => User Object ( [id] =>

Including calander to page - will not show other months than current??
im trying to add an existaing calander onto a profile page by using Code: <?php include "

Sign up to write
Sign up now if you have flare of writing..
Login   |   Register
Follow Us
Indyaspeak @ Facebook Indyaspeak @ Twitter Indyaspeak @ Pinterest RSS



Play Free Quiz and Win Cash