Indeed -- osquery, its commercial precursor Tanium, and the native Microsoft equivalents can help when you've identified one incident and want to find active repeats. For example: "I just cleaned evil.exe from this box; are any other hosts running the same process? What other processes do they have in common?" I've seen hunting use cases as well.
Your example is why we're building our visual playbook system. In this case, query your network scanner for open ports, and then feed the identified hosts into more targeted osquery calls.
Thank you for the reply; however, I am still having a hard time understanding how exactly Osquery works under the hood. How does it communicate with other hosts? With what does it query for information once it reaches a remote node? What kind of overhead does it have in terms of network transfer when querying? Maybe I am completely off base and am mistaking how this works... Is Osquery set up on every individual host and I query for that information remotely using whatever tools I have at my disposal?
You can configure osquery to execute periodic queries (scheduled queries) of all kinds: computing md5 of your binaries and other files, taking a snapshot of sockets/connections per process, and so on.
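For illustration, here is a rough sketch of what such a schedule could look like, written as a small Python script that emits the JSON config. The query names (binary_hashes, process_sockets), the /usr/bin directory, and the one-hour interval are my own placeholders, not anything from a real deployment; the "schedule" / "query" / "interval" keys and the hash and process_open_sockets tables are, as far as I know, standard osquery.

```python
import json


def build_config(interval_seconds=3600):
    """Return an osquery-style config dict with two scheduled queries.

    The query names and the directory below are illustrative assumptions.
    """
    return {
        "schedule": {
            # Hash every file in one directory (the hash table needs a
            # path or directory constraint to return rows).
            "binary_hashes": {
                "query": "SELECT path, md5 FROM hash WHERE directory = '/usr/bin';",
                "interval": interval_seconds,
            },
            # Snapshot which process holds which socket.
            "process_sockets": {
                "query": (
                    "SELECT pid, remote_address, remote_port "
                    "FROM process_open_sockets;"
                ),
                "interval": interval_seconds,
            },
        }
    }


if __name__ == "__main__":
    # Print the config so it can be redirected into a config file.
    print(json.dumps(build_config(), indent=2))
```

You would save the output wherever your osquery daemon reads its config from; the exact path depends on your install.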
By default, osquery uses glog, which means it'll output the results to a local file that you can ship anywhere you want. There are also logging plugins to help you push the results of scheduled queries to other systems.
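If you go the local-file route, the results are one JSON object per line, so picking them up is easy. Below is a minimal sketch of a parser; the sample lines are made up, and I'm assuming the usual field names (name, hostIdentifier, action, columns), so check them against what your own results log actually contains.

```python
import json


def parse_results(lines):
    """Turn JSON result-log lines into flat dicts, skipping blank lines."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        rows.append(
            {
                "host": record.get("hostIdentifier"),
                "query": record.get("name"),
                "action": record.get("action"),
                "columns": record.get("columns", {}),
            }
        )
    return rows


# Fabricated sample lines, for illustration only.
SAMPLE_LINES = [
    '{"name": "process_sockets", "hostIdentifier": "host-a", '
    '"unixTime": 1700000000, "action": "added", '
    '"columns": {"pid": "4242", "remote_address": "203.0.113.7", "remote_port": "443"}}',
    "",
    '{"name": "process_sockets", "hostIdentifier": "host-b", '
    '"unixTime": 1700000060, "action": "removed", '
    '"columns": {"pid": "17", "remote_address": "198.51.100.9", "remote_port": "22"}}',
]

if __name__ == "__main__":
    for row in parse_results(SAMPLE_LINES):
        print(row["host"], row["action"], row["columns"])
```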
Once you have that data flowing through your pipelines you can start doing security/anomaly detection on things.
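As a toy example of the kind of check you can run once the data is centralized, here is a sketch that answers the "are any other hosts running the same process?" question from upthread. It assumes you have already reduced the results to (host, process path) pairs; the host names and paths are invented.

```python
from collections import defaultdict


def hosts_by_process(observations):
    """Map each process path to the sorted list of hosts it was seen on."""
    seen = defaultdict(set)
    for host, path in observations:
        seen[path].add(host)
    return {path: sorted(hosts) for path, hosts in seen.items()}


def other_hosts_running(observations, path, cleaned_host):
    """Hosts other than cleaned_host on which the given process was seen."""
    hosts = hosts_by_process(observations).get(path, [])
    return [h for h in hosts if h != cleaned_host]


# Invented observations, for illustration only.
OBSERVATIONS = [
    ("host-a", "C:\\Temp\\evil.exe"),
    ("host-b", "C:\\Windows\\System32\\svchost.exe"),
    ("host-c", "C:\\Temp\\evil.exe"),
    ("host-a", "C:\\Windows\\System32\\svchost.exe"),
]

if __name__ == "__main__":
    print(other_hosts_running(OBSERVATIONS, "C:\\Temp\\evil.exe", "host-a"))
```

Real detection would of course need more than an exact path match (hashes, parent processes, and so on), but the shape is the same.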
But do you need an installation of osquery on the remote machines too? Or some kind of remote agent? Or does it just try to log in to each remote machine over e.g. SSH?