![Page 1: Mining Bulletin Board Systems Using Community Generation Ming Li, Zhongfei (Mark) Zhang, and Zhi-Hua Zhou PAKDD’08 Reporter: Che-Wei, Liang Date: 2008.07.10](https://reader036.vdocuments.site/reader036/viewer/2022062720/56649f045503460f94c1787c/html5/thumbnails/1.jpg)
Mining Bulletin Board Systems Using Community Generation
Ming Li, Zhongfei (Mark) Zhang, and Zhi-Hua Zhou
PAKDD’08
Reporter: Che-Wei, Liang
Date: 2008.07.10
Outline
• Introduction
• General Model
• Interest-Sharing Group Identification
• Predicting User Behavior Using Generated Community
• Experiment
Introduction
• Bulletin Board System (BBS)
  – Information exchanging and sharing platform
  – Consists of a number of boards
  – Users can read/post messages on different topics
• Users with similar interests may have similar actions
• Effective discovery of relationships between users of a BBS is essential
General Model
• Consider the posted messages
  – Use the title to fully determine the topics of a message
  – Extract key words from titles
  – Map key words to collected topics
• A BBS user tends to join discussions on topics that he or she is interested in
  – Messages that users post may reflect their interests
  – Users’ interests are time-dependent
  – Frequency of messages posted should also be assessed
General Model
• Access pattern of BBS users
  – View of Topics
    • A set of topics and user access frequencies of the messages posted to different boards by different users along the timeline
  – View of Boards
    • A set of boards and frequencies of messages posted to the boards along the timeline
General Model
• BBS model
  – A collection of users, each represented by two timelines of actions, one on the Boards view and one on the Topics view
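A minimal sketch of how such a user record might be held in memory. The class name `UserTimelines`, the per-day granularity, and the board and topic names are assumptions made for illustration; the slides do not specify a concrete data layout.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field
from datetime import date


@dataclass
class UserTimelines:
    """One BBS user as two timelines of posting frequencies (hypothetical layout)."""
    user_id: str
    # Boards view: day -> Counter of board name -> messages posted that day
    boards: dict = field(default_factory=lambda: defaultdict(Counter))
    # Topics view: day -> Counter of topic -> messages posted that day
    topics: dict = field(default_factory=lambda: defaultdict(Counter))

    def record_post(self, day: date, board: str, topic: str) -> None:
        """Count one posted message in both views."""
        self.boards[day][board] += 1
        self.topics[day][topic] += 1


# Illustrative data only; board and topic names are made up.
user = UserTimelines("id_x")
user.record_post(date(2003, 1, 1), "Military", "modern weapons")
user.record_post(date(2003, 1, 1), "Military", "modern weapons")
user.record_post(date(2003, 1, 2), "Programming", "C++")
```

Keeping the two views as separate day-indexed counters lets either view be compared between users without touching the other.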
Interest-Sharing Group Identification
• Given two timelines of actions X and Y of two users id_x and id_y
• A straightforward way
  – Similarity between X_i and Y_j
Interest-Sharing Group Identification
• Average frequency differences of actions
• Local similarity between X_i and Y_j
Interest-Sharing Group Identification
• Hybrid similarity between X_i and Y
• Global similarity between X and Y
Predict User Behavior Using Generated Community
• Given a user id_i
  – Predict what action id_i may take in the near future
• Actions that have been taken by id_i may be closely related to id_i’s future actions
  – Possible solution
    • Compute posterior probability
Predict User Behavior Using Generated Community
• Resolved with interest-sharing groups
  – Similar users may take similar actions at some time instants
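The idea that similar users tend to take similar actions can be illustrated with a simple majority vote over a user's neighbors in the generated community. This is a swapped-in stand-in, not the BPUC algorithm from the paper; the function name `predict_action`, the dictionary layout, and the action labels are all assumptions for illustration.

```python
from collections import Counter


def predict_action(user, community, recent_actions):
    """Return the action most common among the user's neighbors, or None.

    community: dict mapping a user to the set of users in the same
               interest-sharing group (hypothetical layout).
    recent_actions: dict mapping a user to a list of recently taken actions.
    """
    votes = Counter()
    for neighbor in community.get(user, ()):
        votes.update(recent_actions.get(neighbor, ()))
    if not votes:
        return None  # no neighbors, or neighbors with no recorded actions
    return votes.most_common(1)[0][0]


# Illustrative data only.
community = {"id_i": {"id_a", "id_b", "id_c"}}
recent_actions = {
    "id_a": ["post:Programming"],
    "id_b": ["post:Programming", "post:Games"],
    "id_c": ["post:Programming"],
}
prediction = predict_action("id_i", community, recent_actions)
```

Here three neighbors posted to the hypothetical Programming board and one to Games, so the vote picks the Programming action for id_i.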
BPUC algorithm
Experiment
• Data Set
  – BBS of Nanjing University
  – Messages collected from January 1st, 2003 to December 1st, 2005 on the 17 most popular boards
  – 4512 topics on 17 boards, 1109 users
• Evaluation set
  – 42 volunteers: 18 users interested in modern weapons, 12 users fond of programming skills; the rest interested in computer games
Experiments on Community Generation
• Neighborhood accuracy
  – Describes how accurately the neighbors of a user in a generated community share that user's interests
• Component accuracy
  – Measures how well the generated groups represent interests that are common to the individuals in each group
Experiments on Community Generation
• Example
  – A generated community, 7 links between similar users, 10 links between dissimilar users
  – Neighborhood accuracy = (7+10)/21 = 0.810
  – Component accuracy = (7+0)/21 = 0.333
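The arithmetic in this example can be checked directly. The snippet below only reproduces the two fractions as written on the slide; the variable names are illustrative, and it makes no claim about how the paper defines the numerator terms.

```python
total = 21  # denominator used for both measures on the slide

neighborhood_accuracy = (7 + 10) / total
component_accuracy = (7 + 0) / total

print(f"{neighborhood_accuracy:.3f}")  # 0.810
print(f"{component_accuracy:.3f}")     # 0.333
```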
Experiments on Community Generation
• Compare with CORAL
Experiments on Community Generation
• Running time comparison
Experiments on User Behavior Prediction
• 1056 days for training the probability model
• Last 10 days for testing
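A short sketch of this chronological split, assuming the collection period stated in the data set description (January 1st, 2003 to December 1st, 2005) is counted with both end dates included. Under that assumption the period spans 1066 days, which matches 1056 training days plus 10 test days. The variable names are illustrative.

```python
from datetime import date, timedelta

start, end = date(2003, 1, 1), date(2005, 12, 1)

# One entry per calendar day, both endpoints included (an assumption).
days = [start + timedelta(days=i) for i in range((end - start).days + 1)]

# Chronological split: everything except the last 10 days is training data.
train_days, test_days = days[:-10], days[-10:]

print(len(days), len(train_days), len(test_days))  # 1066 1056 10
```

Splitting by time rather than at random keeps the test period strictly after the training period, which fits the goal of predicting near-future actions.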