Tilted Forum Project Discussion Community  

Go Back   Tilted Forum Project Discussion Community > Interests > Tilted Technology


 
 
LinkBack Thread Tools
Old 01-13-2005, 10:44 AM   #1 (permalink)
Insane
 
Location: Wales, UK, Europe, Earth, Milky Way, Universe
[SQL] Usage of DISTINCT

Just a quick one really...

If i have the statement:
Code:
SELECT DISTINCT t1.column1, t2.column2 
FROM myTable1 t1, myTable2 t2

--etc
Are t1.column1 and t2.column2 both made to be distinct or would there be repetitions in column2? If not, how would i go about making t1.column1 distinct but still have repetitions in t2.column2?

Background: I'm trying to select the last five posts from a forum database but i dont want to show more than one post from the same subforum but if all of the last 5 posts were by the same person on different subforums i still want to show them.

I'll thank you later
__________________
There are only two industries that refer to their customers as "users". - Edward Tufte
welshbyte is offline  
Old 01-13-2005, 11:45 AM   #2 (permalink)
zen_tom
Guest
 
you could try using a grouped sub-query to give you the last post for each subforum, ordered by date - and select the top 5 off the list.
 
Old 01-13-2005, 11:50 AM   #3 (permalink)
zen_tom
Guest
 
Something like

Code:
select
subf_id, post_id from
(select top 5 
subf_id, 
post_id, 
max(datetime) as dt 
from item 
group by 
subf_id, 
post_id
order by dt desc) as list
Not sure if your db platform will support this kind of subquery, but it's worth a shot.
 
Old 01-15-2005, 12:28 PM   #4 (permalink)
Insane
 
Location: Wales, UK, Europe, Earth, Milky Way, Universe
Unfortunately the mysql server i'm using doesn't support subqueries, they havent updated the version of mysql to the most recent one yet.
__________________
There are only two industries that refer to their customers as "users". - Edward Tufte
welshbyte is offline  
Old 01-16-2005, 03:06 PM   #5 (permalink)
zen_tom
Guest
 
Are you able to identify and select posts from their forum and timestamps?
If so you could use something like:

Code:
select top 5
subf_id, max(datetime) from posts
group by subf_id
order by max(datetime) desc
And then reference each post by it's sub_form_id and datetime stamp - this way you'll get the 5 most recent postings for each distinct sub_forum
 
Old 01-16-2005, 04:14 PM   #6 (permalink)
Insane
 
Location: Wales, UK, Europe, Earth, Milky Way, Universe
Solved!! EDIT: NOT!

Well i originally posted a big thank you here but the SQL query i came up with just seemed to work because the last 5 posts in my forum were all from different topics. Here's what i'm dealing with now:
Code:
SELECT
    p.post_id, p.post_topic, max( p.post_time ) AS PTIME,
    u.user_name, u.user_id,
    t.topic_title, t.topic_id, t.topic_replies, t.topic_views,
    f.forum_name, f.forum_id
FROM
    wb_posts p
LEFT JOIN
    wb_topics t ON p.post_topic = t.topic_id
LEFT JOIN
    wb_forums f ON t.topic_forum = f.forum_id
LEFT JOIN
    wb_users u ON p.post_author = u.user_id
GROUP BY
    p.post_topic
ORDER BY
    PTIME DESC
LIMIT
    0 , 5
I find that part of this query works in getting the last 5 topics correctly but if , say, the original poster was also the last poster, it would select the original post of the topic.

Its getting quite frustrating and i think i may have to use two queries and a bit of php to kludge them together which i'm not too keen on.

Oh, and that "top 5" statement doesnt work with this mysql server either
__________________
There are only two industries that refer to their customers as "users". - Edward Tufte

Last edited by welshbyte; 01-16-2005 at 06:02 PM.. Reason: Totally wrong
welshbyte is offline  
Old 01-18-2005, 03:53 AM   #7 (permalink)
Upright
 
Location: Berkeley, CA
Quote:
Originally Posted by welshbyte
Well i originally posted a big thank you here but the SQL query i came up with just seemed to work because the last 5 posts in my forum were all from different topics. Here's what i'm dealing with now:
...
A couple things I noticed that puzzled me:
1. Your query groups by p.post_topic, yet you are still selecting from other columns that are not aggregated (with max, min, etc.). This seems like a SQL syntax violation -- all columns in the SELECT clause that are not aggregated should appear in the GROUP BY clause.
2. LEFT JOINs can be expensive, and if you can get away without having to use them, your query will run faster. So, in this case, if p.post_topic is NEVER NULL, then the LEFT JOIN with t is wasted and should be converted into an INNER JOIN. Similar cases with the other LEFT JOINs.
littlebighead is offline  
Old 01-18-2005, 02:33 PM   #8 (permalink)
Insane
 
Location: Wales, UK, Europe, Earth, Milky Way, Universe
1. The code doesnt yield any error messages
2. Using inner joins (using "where" to join the tables) gives the same result set
__________________
There are only two industries that refer to their customers as "users". - Edward Tufte
welshbyte is offline  
Old 01-24-2005, 03:26 AM   #9 (permalink)
Tilted
 
Location: Boston
Quote:
Originally Posted by welshbyte
Are t1.column1 and t2.column2 both made to be distinct or would there be repetitions in column2?
Not sure what database you are using, I live in Postgres land, so some things will probably be different. With Postgres you can do a
Code:
SELECT DISTINCT ON(column1[, column2...]) * FROM whatever
The scary part here is that unless you are careful with your ORDER BY's, what you get for the columns that aren't DISTINCTed can be undefined and different from query to query.

DISTINCT is almost never the Right Way to do something.

Your SQL looks expensive, in similar situations I would tend to have something like:

Code:
SELECT sp.post_id, t.whatever, f.whatever, u.whatever
FROM (
	SELECT MAX(p.post_id)
	FROM posts AS p 
	JOIN topics USING (p.topic_id)
	GROUP BY p.topic_id 
	ORDER BY MAX(p.post_id) DESC LIMIT 5
) AS sp
JOIN topics t USING(topic_id)
JOIN forums f USING (forum_id)
JOIN users u USING (user_id)
Of course I am assuming that your post_id's are incrementing. The execute trees for both of these may very well be similar, but this way I feel I can have a better guess at what the optimizer will do. There seems to be little reason to make those JOINs 'LEFT', but then that three join combination (post -> user; post -> thread -> forum) looks common enough that you probably have it wrapped in a VIEW.

If your database doesn't support that construct (whatever they are called - dynamic tables? I think), then you probably want two queries: get the post_id's with the first and stick them in an IN() in the second. Unless you are running Oracle, the cost of the extra query is completely negligible (and even with Oracle it probably won't be noticeable), just prepare both queries in advance.

Of course if performance does become an issue (ie huge forum and this is a very frequent query), you can think about denormalizing and putting the forum_id into the posts table; but of course lots of people have lots of strong (and divergent) opinions about the merits of doing that.

Anyway, I hope any of this helped at all, I realize it's not exactly specific to your situation.
franzelneekburm is offline  
 

Tags
distinct, sql, usage


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On



All times are GMT -8. The time now is 12:00 AM.

Tilted Forum Project

Powered by vBulletin® Version 3.8.7
Copyright ©2000 - 2024, vBulletin Solutions, Inc.
Search Engine Optimization by vBSEO 3.6.0 PL2
© 2002-2012 Tilted Forum Project

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360