But how do you do: SELECT posts.*, (SELECT COUNT(1) FROM comments WHERE post_id ...

samwillis · on June 26, 2016

I don't know about active record but with Django it's simply:

  Post.objects.all().annotate(Count('comments'))

Which produces (simplified, Django would actually explicitly select each column):

  SELECT posts.*, COUNT(comments.id) AS comments_count
  FROM posts
  LEFT OUTER JOIN comments ON (posts.id = comments.post_id) GROUP BY posts.id

Nice, easy and without 1+N queries.

sirn · on June 26, 2016

That is actually very nice and elegant!

fennecfoxen · on June 26, 2016

I've actually never used ActiveRecord myself, but when an ORM breaks down and it becomes hard to make certain queries I've had good luck with other ORMs by creating a view in the database and telling the ORM that it's a table. Obviously you can't insert into arbitrary views, but most are happy to give you an object back, with appropriate relationships to the actual database-writable objects.

I'm surprised there isn't more explicit support for this in more ORMs (things like not having 'save' methods on the ORMed classes).

_gi12 · on June 26, 2016

That's really easy.

    Post.select("posts.*, count(comments.*) as comments_count").joins(:comments)

jeltz · on June 27, 2016

That is not the same query at all. You need to do a left join and a group by for it to become the same query. Wihout a left join posts without comments wont show up in the result.

_gi12 · on June 27, 2016

You're totally right. My example is an inner join. I will admit that doing outer joins in ActiveRecord is quite painful. Here's a more realistic example, which is not nearly as pretty:

  Post.select("posts.*, COUNT(comments.*) AS comments_count").joins("LEFT JOIN comments ON comments.post_id = posts.id").group(:comments)

sirn · on June 26, 2016

Thanks for the answer!

Unfortunately, what I needed to do back then were more complex than that (say, I need to perform query on top of subquery, e.g. "SELECT ... FROM (SELECT ...) AS a HAVING ... GROUP BY ..." sort of thing) which seems to be something ActiveRecord wasn't designed for.

In the end, I solve the problem by using Arel with an ActiveModel and unwrap all the data manually.

jimktrains2 · on June 26, 2016

Wouldn't you need to do a group by in your query as well?

_gi12 · on June 27, 2016

PavlovsCat · on June 26, 2016

What's more, even with just bare prepared statements.. how do I use dynamically built SQL queries and prepared statements together? And please don't just say "don't", at least not without telling me how to achieve what I need the proper way :)

For example, let's say you have a query that gets search results, and depending on whether the visitor is logged in or not you also may want to know whether a given search result happens to be a favourited item. The way I understand it, I would have to do for example:

    $query_text = 'blah';
    if (user is logged in)
    {
        $query_text .= 'SQL pertaining to favourites';
    }
    $stmt = $dbh->prepare($query_text);
    if (user is logged in)
    {
        $stmt->bindParam(':user_id', $user_id);
    }

Or am I missing something?

krapp · on June 26, 2016

Just use separate statements. You should already know before that point whether or not the request is coming from a logged in user, so to me even the if statements are redundant - have functions or methods that just take parameters for and return the result set of a single SQL query, and figure out which function to call and how valid the results are elsewhere:

    function getResultsForUser($DB, $user_id)
    {
      $query="search for user :user_id, SQL pertaining to favourites";   
      $stmt=$DB->prepare($query);
      $stmt->execute(["user_id"=>$user_id]);
      return $stmt->fetchAll(PDO::FETCH_ASSOC);
    }

    function getResultsForVisitor($DB)
    {
        $query="blah";
        $stmt=$DB->prepare($query);
        $stmt->execute();
        return $stmt->fetchAll(PDO::FETCH_ASSOC);
    }

Yes, it does mean more code, but that code will be much cleaner, more readable, more easily transportable (by not depending on a third party library) and less bug-prone.

PavlovsCat · on June 26, 2016

Deciding between user or not is the most simple case though. The most monstrous function I have by far considers things such as

    - is the user logged in, and maybe even owner of the profile we're at?
    - are we looking at all nodes, or the subnodes/subtree (don't ask) of a node?
    - do we want to display tags/authors/sources?
    - are we filtering by tag/author?
    - what's the display mode: list or full?

There might be more too, that's from the top of my head... Yes, it's messy, but also kina DRY. To "unroll" that would be even more monstrous, that can't be the way D:

toast0 · on June 26, 2016

This actually feels a lot like localization -- unrolling all the ifs is actually the best way to do it. It feels like you're repeating yourself, but having the whole sentence / query together as one unit gives you the ability to understand the whole context -- often times in a database context, you can omit parts that are useless if you get the full context.

PavlovsCat · on June 27, 2016

Unrolling would mean a lot of permutations.. in my example 576 if I'm not mistaken. Understanding the context or dead SQL code isn't really a problem in my case, since the various query bits are mostly orthogonal to each other, it's that they're optional that makes it a bit complex.

But as I said in my other comment, I realized I can just remember the parameters/values as I build the query string, and then bind them all after creating the query. That way I can have my messy cake (that should feel so wrong but tastes so right) and eat it with a plate and a fork like a proper person.

prodigal_erik · on June 26, 2016

Queries with and without user_id probably have different plans (e.g., table scan vs. index lookup) and need to be prepared separately. I've never seen an ORM that takes prepared statements and query planning seriously, generally they only emit trivial queries.

PavlovsCat · on June 26, 2016

Since the queries only get re-used per connection IIRC, that's moot for my case (personal derpy CMS) anyway, since they all just get used once per request, anything loopish is admin-related and fast enough anyway.

What really bugged me though was the idea of having to repeat the decision logic that determines what the query ends up being twice... though I just realized that putting name/value pairs into an array while building the query, and then using that when the query is built to bind them to it, is probably fine.