Best way to cross-match large datasets
Posted on 16th Feb 2014 07:03 pm by admin
Hi,
I'm running a script in which I cross-match about 200,000 data sets with each other. Each data set consists of 8 parameters, and for each data set I want to count all other data sets that have the same or similar parameters.
Right now, I am doing the matching via a MySQL query, which I'm calling about 200,000 times. The problem is that running the query this often is extremely expensive: it takes up to 2 hours until the script is done. So I am wondering whether there is a better method to cross-match data sets, and whether some of you could help me find a better solution.
While researching, I found out that arrays may be a faster alternative to queries. So far, I have identified 3 possible ways of cross-matching:
1. Nested foreach() loops
foreach ($array as $ar1)
    foreach ($array as $ar2)
        if ($ar1[0] == $ar2[0])….
2. Using array_map() with a callback function, so that I would have only one "hand-coded" loop
foreach ($array as $arr)
    if ($arr[0] == $parameter)….
3. Using array_walk(), where I could save one "hand-coded" loop as well.
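To make the three options concrete, here is a minimal sketch of how I imagine them. The sample data, the function names (countNested, countWithMap, countWithWalk) and the choice to compare only parameter 0 with == are assumptions for illustration; the real data sets have 8 parameters and a "similar" test that is not shown here. As far as I can tell, all three variants still compare every data set with every other one (roughly 200,000 × 200,000 comparisons), so they should differ only in loop and function-call overhead.

```php
<?php
// Hypothetical sample: 4 data sets with 3 parameters each (the real ones have 8).
$array = [
    [1, 2, 3],
    [1, 5, 6],
    [2, 2, 3],
    [1, 2, 3],
];

// 1. Nested foreach loops: for each data set, count the OTHER data sets
//    whose parameter 0 is equal.
function countNested(array $array): array
{
    $counts = [];
    foreach ($array as $i => $ar1) {
        $counts[$i] = 0;
        foreach ($array as $j => $ar2) {
            if ($i !== $j && $ar1[0] == $ar2[0]) {
                $counts[$i]++;
            }
        }
    }
    return $counts;
}

// 2. array_map with a callback: array_map replaces the outer loop,
//    leaving one "hand-coded" loop inside the callback.
function countWithMap(array $array): array
{
    return array_map(function ($ar1) use ($array) {
        $parameter = $ar1[0];
        $count = -1; // start at -1 so the data set does not count itself
        foreach ($array as $arr) {
            if ($arr[0] == $parameter) {
                $count++;
            }
        }
        return $count;
    }, $array);
}

// 3. array_walk: same idea, but results are collected through a
//    by-reference variable because array_walk does not return them.
function countWithWalk(array $array): array
{
    $counts = [];
    array_walk($array, function ($ar1, $i) use (&$counts, $array) {
        $counts[$i] = -1; // start at -1 so the data set does not count itself
        foreach ($array as $arr) {
            if ($arr[0] == $ar1[0]) {
                $counts[$i]++;
            }
        }
    });
    return $counts;
}

print_r(countNested($array));
print_r(countWithMap($array));
print_r(countWithWalk($array));
```

On this sample, all three functions return the same counts (2, 2, 0, 2), so the choice between them would come down to speed and readability rather than results.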
Which of these would theoretically be the best/fastest way to go about it? Can anyone tell me what the technical difference between those 3 ways is? And which one is the better approach, or are there other alternatives to them?
I am thankful for any advice that helps me reduce execution time!