Remove duplicates

Discussion in 'Scripts, 3rd Party Apps, and Programming' started by c740015, Feb 5, 2010.

  1. c740015

    c740015 New Member

    Hi, I'm trying to remove duplicate entries from a text file I load. The script grabs every string matching a pattern and displays the resulting array.
    I've also asked it to save the array to a text file.

    The problem is that the text file has duplicates in it, and I can't get rid of them no matter what I do.

    I'm really stuck at this part and would love some help.

    Here is my code:

    Code:
    <?
    $loadfile = file_get_contents("codes.txt");
     
    $expression = "#[A-Z0-9]{8}([-_][A-Z0-9]{8}){3}#";
    $resultat = Array("#[A-Z0-9]{8}([-_][A-Z0-9]{8}){3}#");
    $file="test.txt";
     
    preg_match_all($expression, $loadfile, $resultat, PREG_PATTERN_ORDER);
    
    echo "<pre>";
       print_r(array_unique($resultat));
    echo "</pre>";
     
    ob_start();
    print_r(array_unique($resultat));
    $result = ob_get_contents();
    ob_end_clean();
    
    $handle=fopen($file, "w"); 
    fwrite($handle, $result);
    fclose($handle);
     
    ?>
    This is the output of the code:

    Array
    (
    [0] => Array
    (
    [0] => 00000000-00000000-00000000-00000001
    [1] => 00000000-00000000-00000000-0000000B
    [2] => 00000000-00000000-00000000-0000000F
    [3] => 00000000-00000000-00000000-00000016
    [4] => 00000000-00000000-00000000-00010001
    [74] => 00000000-00000000-00000000-00000001
    [75] => 00000000-00000000-00000000-0000000B
    [76] => 00000000-00000000-00000000-0000000F
    [77] => 00000000-00000000-00000000-00000001
    [78] => 00000000-00000000-00000000-0000000B
    [79] => 00000000-00000000-00000000-0000000F
    [80] => 00000000-00000000-00000000-00000016

    As you can see, it's duplicating the values.
    I need either a way to remove the keys and kill the duplicates that way, or a method to just kill the duplicates.

    I'm really stuck :(

    Any help would be amazing.
     
  2. misson

    misson Community Paragon Community Support

    The issue with your code is that $resultat is a multidimensional array: the matched codes are in $resultat[0], not in $resultat itself. The array_unique is called on the wrong value; a $resultat = $resultat[0]; after the preg_match_all call would fix this, but the performance can be improved by taking a different approach.
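
    To make the shape of $resultat concrete, here is a minimal sketch. The sample input string is made up for illustration (the real script reads codes.txt), and array_values is only there to renumber the keys after the duplicates are dropped.

```php
<?php
// Hypothetical sample input; the real script loads codes.txt instead.
$text = "AAAAAAAA-BBBBBBBB-CCCCCCCC-00000001\n"
      . "AAAAAAAA-BBBBBBBB-CCCCCCCC-00000002\n"
      . "AAAAAAAA-BBBBBBBB-CCCCCCCC-00000001\n";

preg_match_all('#[A-Z0-9]{8}([-_][A-Z0-9]{8}){3}#', $text, $resultat, PREG_PATTERN_ORDER);

// With PREG_PATTERN_ORDER, $resultat[0] holds the full matches and
// $resultat[1] holds what the parenthesized group captured.
$fullMatches = $resultat[0];

// Deduplicate the list of codes itself, then renumber the keys from 0.
$unique = array_values(array_unique($fullMatches));

print_r($unique);
```

    Calling array_unique on $resultat directly compares the sub-arrays with each other rather than the codes inside them, which is why the duplicates survive.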

    PHP arrays are associative, and a key can appear only once. Store the codes as keys rather than values:

    PHP:
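    function getCodes($fname) {
        $codes = array();
        $codeFile = fopen($fname, 'r');
        if ($codeFile) {
            // Read one line at a time so the whole file is never held in memory.
            while (FALSE !== ($line = fgets($codeFile))) {
                preg_match('/[A-Z0-9]{8}([-_][A-Z0-9]{8}){3}/', $line, $match);
                if ($match) {
                    // Using the code as the key means a repeat just overwrites itself.
                    $codes[$match[0]] = $match[0];
                }
            }
            fclose($codeFile);
        }
        return $codes;
    }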
    You might want to set your own error handler. I've switched from file_get_contents etc. to the wrapped C file I/O functions (fopen, fgets, fclose) to reduce memory usage, since they read one line at a time; if the files are small enough, you don't need to make this change.
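
    Since getCodes returns the codes keyed by themselves, saving them to a text file doesn't need print_r and output buffering. A rough sketch, assuming one code per line is the format you want; saveCodes is a made-up helper name and the temp file is only for the demo:

```php
<?php
// saveCodes() is a hypothetical helper, not part of the original script.
// It writes each code on its own line and returns the number of bytes written.
function saveCodes(array $codes, $fname) {
    // implode() joins the array values; the keys are ignored.
    $body = $codes ? implode("\n", $codes) . "\n" : '';
    return file_put_contents($fname, $body);
}

// Codes keyed by themselves, the same shape getCodes() returns.
$codes = array(
    '00000000-00000000-00000000-00000001' => '00000000-00000000-00000000-00000001',
    '00000000-00000000-00000000-0000000B' => '00000000-00000000-00000000-0000000B',
);

// Demo only: write to a temporary file and show what ended up in it.
$tmp = tempnam(sys_get_temp_dir(), 'codes');
saveCodes($codes, $tmp);
echo file_get_contents($tmp);
```

    In the real script you would pass the result of getCodes and your own output file name (test.txt in the original code) instead.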

    Note that you could write this as a very simple shell script. If the file lines only contain the codes, all you'd need is
    Code:
    #!/bin/bash
    sort "$1" | uniq > "$2"
    
     
    Last edited: Feb 6, 2010
  3. c740015

    c740015 New Member

    Nice, I'm going to get on this now, cheers :)

    I'm a little confused, but I think I understand your reply. I'm new to PHP and just learning.

    I do have some programming knowledge, but not so much in PHP.

    I'll let you know how I get on :)
    Edit:
    Ugh, I'm still stuck :(

    I did try the first option, $resultat = $resultat[0];

    Unfortunately it's still displaying exactly the same information, with loads of duplicates.

    I also tried to use the function,

    and I'm getting this error: Warning: fclose(): supplied argument is not a valid stream resource in

    :(
     
    Last edited: Feb 6, 2010
  4. slacker3

    slacker3 New Member

    Does it have to be done with PHP?

    Otherwise you should really use sort/uniq, since it does exactly what you want with just one line of code.

    If you don't have a Linux box:
    http://www.cygwin.com/
     
  5. c740015

    c740015 New Member

    Yeah, Cygwin is no good for me. I want users to input the codes into a text box, then I can parse them and remove the duplicates.

    Having Linux installed on my machine is no good.

    And since the user could be putting anything in the box, it really does need to grab the 32-character codes.

    The only problem I'm having is removing duplicates :(

    Once that is done I'm all set.

    Basically I have set up a text box that people drop their codes in, then I just run this script to parse them.

    But I really need duplicates to be removed or it would be no use at all.

    I have spent a lot of time on it reading various sites, and I'm totally puzzled and can't think what to do now :(
     
  6. slacker3

    slacker3 New Member

  7. misson

    misson Community Paragon Community Support

    Where did you place the statement? After the call to preg_match_all? The following works for me:
    PHP:
    $loadfile = file_get_contents("codes.txt");
    $expression = "#[A-Z0-9]{8}([-_][A-Z0-9]{8}){3}#";
    preg_match_all($expression, $loadfile, $resultat, PREG_PATTERN_ORDER);
    $resultat = array_unique($resultat[0]);
    array_unique is likely to be the most expensive (in terms of time) part of the script, so don't call it more than you must. Better not to use it at all.
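
    If you want to skip array_unique altogether while still parsing a whole string at once (say, the contents of a text box), one possible sketch is to let array keys do the deduplication. uniqueCodes is a made-up name and the sample input is invented; whether this is actually faster depends on the PHP version and the input size, so measure before relying on it.

```php
<?php
// uniqueCodes() is a hypothetical helper name used for illustration.
function uniqueCodes($text) {
    // (?: ... ) is a non-capturing group, so only the full matches are collected.
    preg_match_all('#[A-Z0-9]{8}(?:[-_][A-Z0-9]{8}){3}#', $text, $matches);

    // array_flip() turns the values into keys, so repeated codes collapse
    // into a single key; array_keys() turns them back into a plain list,
    // in the order each code was first seen.
    return array_keys(array_flip($matches[0]));
}

// Invented input with surrounding junk and one repeated code.
$input = "junk 00000000-00000000-00000000-00000001 more\n"
       . "00000000_00000000_00000000_0000000B\n"
       . "00000000-00000000-00000000-00000001";

print_r(uniqueCodes($input));
```

    The function takes a plain string, so it doesn't matter whether the text comes from a file or from a submitted form field.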

    The fclose error in the function has been fixed in the code above; try it again.
     
    Last edited: Feb 6, 2010
  8. c740015

    c740015 New Member

    Nice one, sorry for the late reply.

    misson, great guy, thanks very much for your help. It worked perfectly.

    Any idea how I would insert each code into a database and pull them out again to show on a different page, so I can store a database record of each code? My goal is to have another screen where I can sort each code into categories or something like that, or just store them in a database.

    I want to learn more :)

    Thanks misson, rep for that.
     
  9. misson

    misson Community Paragon Community Support

