View Poll Results: How much accuracy do you think the system has?

Voters: 8
  • more than 90%: 0 (0%)
  • 70-90%: 2 (25.00%)
  • 50-90%: 0 (0%)
  • 40-50%: 0 (0%)
  • very poor (<40%): 6 (75.00%)

Thread: New antibot measure, need reviews.

  1. #1
    Teensweb (x10 Lieutenant)
    Join Date
    May 2008
    Posts
    352

    New antibot measure, need reviews.

    Hi guys, I've been trying to develop a new anti-bot measure based on simple image recognition.
    For example, the page will show an image and ask you to identify what it is.
    Here is a live demo.
    The currently included objects are:
    chair, clouds, trees, ship, river, clock, books icon, car, tv, plane and mountain.
    What do you think about the efficiency of this system? The images are pulled at random from the web, so accuracy is not 100%...

  2. #2
    cybrax (x10 Elder)
    Join Date
    Aug 2009
    Location
    UK
    Posts
    699

    Re: New antibot measure, need reviews.

    Works well, as for accuracy it's hard to say after just playing with it for a minute.
    I suppose it would depend on where the images and descriptions are obtained from. Pulling the results out of a random Google search page may not give descriptions as accurate as using Flickr or Morguefile would.

    Are you going to share the source code with the community here?
    The code must flow.


  3. #3
    lemon-tree (x10 Minion)
    Join Date
    Nov 2007
    Posts
    1,420

    Re: New antibot measure, need reviews.

    The problem with this sort of captcha is that it depends very much on what a user perceives an image to be of. For example, this image:

    It has more than one possible answer: Boat, Ship, Ocean, Sea, Tree and any number of variations. I entered boat and got an invalid captcha, so what one person calls a boat another may call a ship. Essentially, a captcha with any ambiguity about what the answer is will only frustrate your users.

    What you are trying to do here is reinvent the wheel when there are already viable systems that have been shown to be effective. Additionally, image recognition captchas have been shown to be insecure in the past. For example, I could index the first thousand or more images returned by a search for 'Ship', and then every time you show an image of a ship I could break the system first time. If I don't have an image, then I may still be able to make a good guess based on its content, i.e. if the image is lots of blue with a white shape then I may guess 'Ship' or 'Cloud'. If this process is repeated for a whole range of phrases then your system becomes effectively useless, as eventually you will show an image that I have data for.

    This is why randomly generated images are so popular for captchas, as there is no way to predict what will show up. The only way to break them is to decode them. Relying on the chance that I don't have that image is not a good system for a captcha.
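
    The indexing weakness described above can be shown with a toy sketch. Everything here is an assumption made for illustration: the byte strings stand in for downloaded image files, and exact SHA-256 matching stands in for whatever image comparison would really be used. The only point is that a fixed pool of web images can be answered by a plain lookup table.

```python
import hashlib


def fingerprint(image_bytes: bytes) -> str:
    """Exact-match fingerprint: only byte-identical files share one."""
    return hashlib.sha256(image_bytes).hexdigest()


def build_index(labelled_images):
    """Map fingerprint -> label for every (label, image_bytes) pair."""
    return {fingerprint(data): label for label, data in labelled_images}


def answer(index, shown_image: bytes):
    """Return the stored label, or None if the image was never indexed."""
    return index.get(fingerprint(shown_image))


# Hypothetical stand-ins for the first results of a search per keyword.
crawled = [
    ("ship", b"ship-photo-1"),
    ("ship", b"ship-photo-2"),
    ("clouds", b"cloud-photo-1"),
]
index = build_index(crawled)

print(answer(index, b"ship-photo-2"))  # an indexed image is answered at once
print(answer(index, b"never-seen"))    # an unseen image falls back to guessing
```

    A captcha image generated fresh for each request has no such table to look up, which is the contrast being drawn with randomly generated captchas.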

  4. #4
    hazar90 (x10Hosting Member)
    Join Date
    Nov 2010
    Location
    Serbia
    Posts
    37

    Re: New antibot measure, need reviews.

    Quote Originally Posted by lemon-tree View Post
    You're right about that.

  5. #5
    Teensweb (x10 Lieutenant)
    Join Date
    May 2008
    Posts
    352

    Re: New antibot measure, need reviews.

    Quote (cybrax): "Are you going to share the source code with the community here?"
    Sure I will, but first I need to know whether it's worth it; that's why I set up the poll. I expected more responses. Is it because it's a public poll?
    Last edited by Teensweb; 02-07-2011 at 01:08 AM.

  6. #6
    warlordste (x10 Elder)
    Join Date
    Sep 2007
    Location
    Wigan
    Posts
    653

    Re: New antibot measure, need reviews.

    As someone else posted, it relies on what the image means to you. Another thing is spelling: one of them was a pic of a ship. Now, I would call it a photo, but it might have been photograph. Personally, without having writing at the bottom, like "what's in the middle of the picture" or "what color is the ship", I don't think it's going to work. Unless you do something like this, you will probably lose visitors.



  7. #7
    Livewire (Abuse Compliance Officer)
    Join Date
    Jun 2005
    Location
    Behind a keyboard.
    Posts
    8,998

    Re: New antibot measure, need reviews.

    The problem is partly how easy it is to defeat, which makes its accuracy low, and partly the number of possible answers for each image. For instance, if I put a picture of a Boeing 747 up in that captcha, what should it take as a valid answer?

    The list I can see:
    Airplane
    Jet
    Jumbo Jet
    Jumbo-Jet
    Queen of the Skies
    Wide-body Commercial Airliner
    Boing 747
    Boeing 747

    Which one should it accept, ignoring that there's more than what I've stated here? The worse news is if you accept -every- answer, then it's easier for the bots to guess, which defeats the purpose. Captchas that are randomly generated can fare better here as there's 1 solution and only 1.


    Plus, we run into the issue of image count. If we take a standard randomly generated captcha of 7 characters from the 26-letter US alphabet, we have approximately (meaning I actually checked it in a calculator) 8,031,810,176 different possible combinations. Add to that the fact that each image is generated on the fly, with the letters skewed and distorted so that a human can read them and a bot can't, and we've got a letter combination that is virtually unguessable by a bot. We can't exactly store that many pictures for the image captcha, along with all their possible answers. Even to match a 3-character captcha, we'd still need 17,576 different pictures.
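
    For anyone who wants to check the two figures above, they are just 26 raised to the captcha length. A minimal sketch (the function name is made up for illustration):

```python
ALPHABET_SIZE = 26  # letters in the US alphabet, single case


def combinations(length: int, alphabet_size: int = ALPHABET_SIZE) -> int:
    """Number of distinct strings of the given length over the alphabet."""
    return alphabet_size ** length


seven = combinations(7)
three = combinations(3)

print(f"{seven:,}")  # 8,031,810,176
print(f"{three:,}")  # 17,576
```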




    tl;dr?

    Drawbacks: Not nearly enough combinations when compared to a standard US Alphabet captcha, and too many possible solutions for each image.
    Last edited by Livewire; 02-07-2011 at 01:50 AM.



  8. #8
    Teensweb (x10 Lieutenant)
    Join Date
    May 2008
    Posts
    352

    Re: New antibot measure, need reviews.

    @warlordste:
    As for photo versus photograph, I had already solved that problem: if a tree comes up, typing either tree or trees will work, and so will plane instead of airplane.
    P.S: I had to read your post twice to understand it correctly, it would be very nice of you if you could use full stops wherever required...
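
    The tree/trees and plane/airplane matching mentioned above could be done roughly as follows. This is only a sketch under stated assumptions, not the code behind the demo: the synonym table and the strip-the-trailing-s plural rule are both invented for illustration.

```python
# Hypothetical synonym table: typed word -> canonical label.
SYNONYMS = {
    "airplane": "plane",
    "aeroplane": "plane",
    "television": "tv",
}


def normalise(answer: str) -> str:
    """Lower-case, trim, drop a simple plural 's', then map synonyms."""
    word = answer.strip().lower()
    # Naive plural rule: fine for trees/clouds/ships, wrong for words like "bus".
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        word = word[:-1]
    return SYNONYMS.get(word, word)


def is_correct(typed: str, expected: str) -> bool:
    """Compare the typed answer with the expected label after normalising both."""
    return normalise(typed) == normalise(expected)


print(is_correct("Trees", "tree"))      # True
print(is_correct("airplane", "plane"))  # True
print(is_correct("car", "plane"))       # False
```

    Every synonym added makes the captcha friendlier to people, but it also widens the set of answers a bot can hit by guessing.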

    @Livewire:
    That's an important issue you pointed out. I am more of a mathematician than a programmer, and my very reason for building this system is that captchas are becoming outdated. Even I have a plugin in Firefox that scans the captcha on certain sites I visit and fills it in for me, so what less would you expect of spammers? I guess that ruins your probability calculation (you could've just written 26^7).

    But what sort of scanning program can you write for recognizing objects in an image? (Forget objects, can someone at least show me how they would make the computer recognize just a book from a bunch of random images?)
    Of course, indexing all the images of a search is an obvious breach, but how many such images will people index, and how would people know which search engine is used? I am not sure about this part, but if people would do it, then I'll probably give up on this thing.
    And about storing that many images, that's a real issue, but who's talking about "us" storing all the images? I am just getting images stored on the web by others!
    But no system is perfect, and since this is my first venture in actually building something for the web, this one's far from perfect.
    The disadvantages I have come across so far are:
    1. The accuracy of the images pulled from the web (I want statistics from others; that's one of the main reasons for posting it here):
    - Have you tested that yourself? (I am not claiming anything, I just want to make sure...)
    I'm still working on improving that.
    2. As you pointed out, the various possible answers for the same image:
    lemon-tree was also right to ask which part of the image should be considered.
    But still, I just want to ask: which part would you consider? (That'll decide whether I should work more on this or not; again, one of my reasons for asking for reviews...)
    Well, people who are more like computers will be confused I guess!

    But still, won't the problem be solved (to some extent) if a list of possible objects is provided? (I don't plan on having more than 20, for now.)
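
    To put a rough number on the list idea: if the possible objects are published, a bot does not need to recognise anything and can simply pick from the list. The sketch below assumes a bot that guesses uniformly at random and gets a fresh, independent challenge on every attempt; both are simplifying assumptions, and the function names are made up for illustration.

```python
from fractions import Fraction


def blind_guess_chance(choices: int) -> Fraction:
    """Chance that one uniform random guess hits the right label."""
    return Fraction(1, choices)


def chance_within(attempts: int, choices: int) -> float:
    """Chance that at least one of several independent guesses is right."""
    return 1 - (1 - 1 / choices) ** attempts


print(blind_guess_chance(20))       # 1/20
print(blind_guess_chance(26 ** 7))  # 1/8031810176
print(round(chance_within(10, 20), 2))  # about 0.40 after ten tries
```

    So a 20-item list lets a blind guesser through one time in twenty, whereas a 7-letter text captcha has to be read rather than guessed.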

    If anyone else finds more drawbacks, you are welcome...
    Last edited by Teensweb; 02-07-2011 at 07:07 AM.

  9. #9
    Salvatos (x10 Lieutenant)
    Join Date
    Jun 2006
    Location
    Québec, Canada
    Posts
    271

    Re: New antibot measure, need reviews.

    I'm not sure if you were actually asking the question, but yes, there are programs that will identify objects in a picture using general shapes and colors. One of my friends worked on one last summer; I believe it was to count how many cars were in a picture. As far as I know, that's also how Google Maps censors faces and license plates. It's not perfect, but it can definitely help beat your antibot quickly.

  10. #10
    Teensweb (x10 Lieutenant)
    Join Date
    May 2008
    Posts
    352

    Re: New antibot measure, need reviews.

    Yep, I've heard about neural networks, but it's definitely easier to write programs that identify word captchas, don't you think?
    What I'm saying is, they are getting outdated. I just thought of this idea and went ahead and wrote it because it took only 10 lines of code in PHP. Well, I guess I can happily drop the idea due to lack of accuracy, as most say over here. Maybe someone else can improve upon this or even come up with some new idea...
    And about reinventing the wheel, I am trying to "improve" the wheel, which can be useful, you know...
    BTW, most sci-fi writers even think that wheels have become outdated and we should switch to hovercraft or something!
    Last edited by Teensweb; 02-07-2011 at 10:36 PM.
