01-13-2005, 10:44 AM | #1 (permalink) |
Insane
Location: Wales, UK, Europe, Earth, Milky Way, Universe
|
[SQL] Usage of DISTINCT
Just a quick one really...
If i have the statement: Code:
SELECT DISTINCT t1.column1, t2.column2 FROM myTable1 t1, myTable2 t2 --etc Background: I'm trying to select the last five posts from a forum database but i dont want to show more than one post from the same subforum but if all of the last 5 posts were by the same person on different subforums i still want to show them. I'll thank you later
__________________
There are only two industries that refer to their customers as "users". - Edward Tufte |
01-15-2005, 12:28 PM | #4 (permalink) |
Insane
Location: Wales, UK, Europe, Earth, Milky Way, Universe
|
Unfortunately the mysql server i'm using doesn't support subqueries, they havent updated the version of mysql to the most recent one yet.
__________________
There are only two industries that refer to their customers as "users". - Edward Tufte |
01-16-2005, 03:06 PM | #5 (permalink) |
Guest
|
Are you able to identify and select posts from their forum and timestamps?
If so you could use something like: Code:
select top 5 subf_id, max(datetime) from posts group by subf_id order by max(datetime) desc |
01-16-2005, 04:14 PM | #6 (permalink) |
Insane
Location: Wales, UK, Europe, Earth, Milky Way, Universe
|
Solved!! EDIT: NOT!
Well i originally posted a big thank you here but the SQL query i came up with just seemed to work because the last 5 posts in my forum were all from different topics. Here's what i'm dealing with now:
Code:
SELECT p.post_id, p.post_topic, max( p.post_time ) AS PTIME, u.user_name, u.user_id, t.topic_title, t.topic_id, t.topic_replies, t.topic_views, f.forum_name, f.forum_id FROM wb_posts p LEFT JOIN wb_topics t ON p.post_topic = t.topic_id LEFT JOIN wb_forums f ON t.topic_forum = f.forum_id LEFT JOIN wb_users u ON p.post_author = u.user_id GROUP BY p.post_topic ORDER BY PTIME DESC LIMIT 0 , 5 Its getting quite frustrating and i think i may have to use two queries and a bit of php to kludge them together which i'm not too keen on. Oh, and that "top 5" statement doesnt work with this mysql server either
__________________
There are only two industries that refer to their customers as "users". - Edward Tufte Last edited by welshbyte; 01-16-2005 at 06:02 PM.. Reason: Totally wrong |
01-18-2005, 03:53 AM | #7 (permalink) | |
Upright
Location: Berkeley, CA
|
Quote:
1. Your query groups by p.post_topic, yet you are still selecting from other columns that are not aggregated (with max, min, etc.). This seems like a SQL syntax violation -- all columns in the SELECT clause that are not aggregated should appear in the GROUP BY clause. 2. LEFT JOINs can be expensive, and if you can get away without having to use them, your query will run faster. So, in this case, if p.post_topic is NEVER NULL, then the LEFT JOIN with t is wasted and should be converted into an INNER JOIN. Similar cases with the other LEFT JOINs. |
|
01-18-2005, 02:33 PM | #8 (permalink) |
Insane
Location: Wales, UK, Europe, Earth, Milky Way, Universe
|
1. The code doesnt yield any error messages
2. Using inner joins (using "where" to join the tables) gives the same result set
__________________
There are only two industries that refer to their customers as "users". - Edward Tufte |
01-24-2005, 03:26 AM | #9 (permalink) | |
Tilted
Location: Boston
|
Quote:
Code:
SELECT DISTINCT ON(column1[, column2...]) * FROM whatever DISTINCT is almost never the Right Way to do something. Your SQL looks expensive, in similar situations I would tend to have something like: Code:
SELECT sp.post_id, t.whatever, f.whatever, u.whatever FROM ( SELECT MAX(p.post_id) FROM posts AS p JOIN topics USING (p.topic_id) GROUP BY p.topic_id ORDER BY MAX(p.post_id) DESC LIMIT 5 ) AS sp JOIN topics t USING(topic_id) JOIN forums f USING (forum_id) JOIN users u USING (user_id) If your database doesn't support that construct (whatever they are called - dynamic tables? I think), then you probably want two queries: get the post_id's with the first and stick them in an IN() in the second. Unless you are running Oracle, the cost of the extra query is completely negligible (and even with Oracle it probably won't be noticeable), just prepare both queries in advance. Of course if performance does become an issue (ie huge forum and this is a very frequent query), you can think about denormalizing and putting the forum_id into the posts table; but of course lots of people have lots of strong (and divergent) opinions about the merits of doing that. Anyway, I hope any of this helped at all, I realize it's not exactly specific to your situation. |
|
Tags |
distinct, sql, usage |
|
|