It really doesn't take much effort to detect the majority of scrapers. Usually you do so by monitoring the request patterns of any given IP:
Is each request an incremented profile page (/users/1, /users/2, etc)?
Are there dozens of requests a minute (faster than a typical user could read)?
Is static content (particularly images and CSS) being downloaded too, or just the HTML?
Sometimes the Referer HTTP header can give clues too - though you have to be careful there, as it's as unreliable as the user agent header.
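
To make that concrete, here's a rough in-memory sketch of those per-IP checks in Python. Everything in it is illustrative rather than a definitive implementation: the thresholds, URL patterns and the looks_like_scraper helper are made up for the example, and a real deployment would keep this state somewhere like Redis behind your web framework instead of in process memory.

    import re
    import time
    from collections import defaultdict, deque

    WINDOW_SECONDS = 60
    MAX_REQUESTS_PER_WINDOW = 60   # "dozens a minute" - tune for your own traffic
    SEQUENTIAL_RUN = 5             # this many consecutive user IDs looks like enumeration

    recent_hits = defaultdict(deque)   # ip -> timestamps of recent requests
    id_runs = {}                       # ip -> (last /users/<id> seen, length of run)
    saw_assets = set()                 # ips that have fetched any CSS/images

    USER_PAGE = re.compile(r"^/users/(\d+)$")
    ASSET = re.compile(r"\.(css|js|png|jpe?g|gif|svg)$", re.I)

    def looks_like_scraper(ip, path):
        now = time.time()

        # 1. Request rate: faster than a typical user could read?
        hits = recent_hits[ip]
        hits.append(now)
        while hits and now - hits[0] > WINDOW_SECONDS:
            hits.popleft()
        if len(hits) > MAX_REQUESTS_PER_WINDOW:
            return True

        # 2. Incrementing profile pages (/users/1, /users/2, ...)?
        m = USER_PAGE.match(path)
        if m:
            current = int(m.group(1))
            last, run = id_runs.get(ip, (None, 0))
            run = run + 1 if last is not None and current == last + 1 else 1
            id_runs[ip] = (current, run)
            if run >= SEQUENTIAL_RUN:
                return True

        # 3. Note whether this IP ever fetches static content; many page views
        #    with no assets at all is another signal worth weighing.
        if ASSET.search(path):
            saw_assets.add(ip)

        return False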
However, if you're really paranoid about scrapers you can also throw in some honeypots, e.g. a fake user (/users/13) - an account that doesn't exist, so that page is never linked from within your site; you only reach it if you're incrementing through the user IDs. Or perhaps a link within your HTML that doesn't render, so it's only reachable by automated scripts that don't check which links are actually visible in the rendered view. Anyone who gets ensnared in your honeypot could then be put on a temporary IP blacklist. The danger of doing this is that you accidentally blacklist good crawlers if you're not careful about setting appropriate robots rules.
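
Here's a minimal sketch of the fake-user honeypot, using Flask purely as an example framework - the /users/13 route, the in-memory blacklist and the one-hour TTL are all assumptions for illustration, not a recommendation:

    import time
    from flask import Flask, abort, request

    app = Flask(__name__)

    BLACKLIST_TTL = 60 * 60   # temporary ban: one hour (illustrative)
    blacklist = {}            # ip -> expiry timestamp; use Redis or similar in production

    @app.before_request
    def reject_blacklisted():
        expiry = blacklist.get(request.remote_addr, 0)
        if expiry > time.time():
            abort(403)

    # /users/13 doesn't exist and is never linked from the site, so the only
    # way to reach it is by incrementing through user IDs.
    @app.route("/users/13")
    def honeypot_user():
        blacklist[request.remote_addr] = time.time() + BLACKLIST_TTL
        abort(404)   # respond like any other missing page

    # Disallow the trap in robots.txt so well-behaved crawlers never hit it
    # and end up blacklisted by accident.
    @app.route("/robots.txt")
    def robots():
        return "User-agent: *\nDisallow: /users/13\n", 200, {"Content-Type": "text/plain"}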