Advanced Hadoop: PeopleRank, Mining Valuable Users from Social Relationships
Published: 2021-03-11 05:28:28 | Category: Big Data | Source: compiled from the web
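Overview: When reposting, please credit the source: reposted from Thinkgamer's CSDN blog, blog.csdn.net/gamer_gyt
1: PageRank and PeopleRank
2: Requirements analysis: mining the valuable users of CSDN blogs
3: Algorithm model: the PeopleRank algorithm
4: Architecture design: from data preparation to the MapReduce (MR) implementation of the PR algorithm
5: Program development: had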
Only part of the code is shown below; the rest can be downloaded from GitHub.

dataEtl.java

package pagerankjisuan;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class dataEtl {

    public static void main(String[] args) throws IOException {
        // Delete output files left over from a previous run
        File f1 = new File("MyItems/pagerankjisuan/people.csv");
        if (f1.isFile()) {
            f1.delete();
        }
        File f = new File("MyItems/pagerankjisuan/peoplerank.txt");
        if (f.isFile()) {
            f.delete();
        }

        // Open the input file
        File file = new File("MyItems/pagerankjisuan/day7_author100_mess.csv");
        // Buffered reader over the input file
        BufferedReader reader = new BufferedReader(new FileReader(file));
        try {
            String line = null;
            // Keep reading until readLine() returns null (end of file)
            while ((line = reader.readLine()) != null) {
                String[] userMess = line.split(",");
                // The first field is the id; the tenth field is the fan list
                // (the eleventh when the line has 11 fields)
                String userid = userMess[0];
                if (userMess.length != 0) {
                    if (userMess.length == 11) {
                        // "|" is a regex metacharacter, so it must be escaped
                        String[] focusName = userMess[10].split("\\|");
                        for (int i = 1; i < focusName.length; i++) {
                            write(userid, focusName[i]);
                            // System.out.println(userid + " " + focusName[i]);
                        }
                    } else {
                        // "|" is a regex metacharacter, so it must be escaped
                        String[] focusName = userMess[9].split("\\|");
                        for (int j = 1; j < focusName.length; j++) {
                            write(userid, focusName[j]);
                            // System.out.println(userid + " " + focusName[j]);
                        }
                    }
                }
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } finally {
            reader.close();
            // ETL for peoplerank.txt: one line per id 1..100, each with initial rank 1
            for (int i = 1; i <= 100; i++) {
                FileWriter writer = new FileWriter("MyItems/pagerankjisuan/peoplerank.txt", true);
                writer.write(i + "\t" + 1 + "\n");
                writer.close();
            }
        }
        System.out.println("OK..................");
    }

    private static void write(String userid, String nameid) {
        // Append one "userid,nameid" line to people.csv
        try {
            if (!nameid.contains("\n")) {
                FileWriter writer = new FileWriter("MyItems/pagerankjisuan/people.csv", true);
                writer.write(userid + "," + nameid + "\n");
                writer.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

prjob.java
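The prjob.java listing is not shown in this excerpt. Independently of that file, the following is a minimal single-machine sketch of the kind of rank iteration that the two files written by dataEtl.java could feed: edge lines of the form userid,nameid (as in people.csv) and an initial rank of 1 per user (as in peoplerank.txt). It is not the author's MapReduce job. The class name PeopleRankSketch, the damping factor 0.85, the edge direction (rank flows from the first id to the second), and the even redistribution of rank from users with no outgoing edges are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class, not part of the original project.
class PeopleRankSketch {

    // Conventional PageRank damping factor (an assumption here).
    static final double DAMPING = 0.85;

    /**
     * Runs a fixed number of PageRank-style iterations over "from,to" edge lines.
     * Every user starts with rank 1, mirroring the lines written to peoplerank.txt.
     */
    static Map<String, Double> rank(List<String> edgeLines, int iterations) {
        // Adjacency list: user -> users it points to. Every user gets an entry.
        Map<String, List<String>> outLinks = new TreeMap<>();
        for (String line : edgeLines) {
            String[] parts = line.trim().split(",");
            if (parts.length != 2) {
                continue; // skip malformed lines
            }
            outLinks.computeIfAbsent(parts[0], k -> new ArrayList<>()).add(parts[1]);
            outLinks.computeIfAbsent(parts[1], k -> new ArrayList<>());
        }

        Map<String, Double> ranks = new TreeMap<>();
        for (String user : outLinks.keySet()) {
            ranks.put(user, 1.0);
        }
        int n = ranks.size();

        for (int it = 0; it < iterations; it++) {
            Map<String, Double> next = new TreeMap<>();
            for (String user : outLinks.keySet()) {
                next.put(user, 0.0);
            }
            double dangling = 0.0; // rank held by users with no outgoing edges
            for (Map.Entry<String, List<String>> e : outLinks.entrySet()) {
                double r = ranks.get(e.getKey());
                List<String> targets = e.getValue();
                if (targets.isEmpty()) {
                    dangling += r;
                    continue;
                }
                // Split this user's rank evenly across its outgoing edges.
                double share = r / targets.size();
                for (String t : targets) {
                    next.merge(t, share, Double::sum);
                }
            }
            // Apply damping; dangling rank is spread evenly over all users.
            for (Map.Entry<String, Double> e : next.entrySet()) {
                e.setValue((1 - DAMPING) + DAMPING * (e.getValue() + dangling / n));
            }
            ranks = next;
        }
        return ranks;
    }

    public static void main(String[] args) {
        // Toy edge list in the people.csv format.
        List<String> edges = Arrays.asList("1,2", "2,1", "3,1");
        rank(edges, 100).forEach((user, r) -> System.out.printf("%s\t%.4f%n", user, r));
    }
}
```

Running main prints one user id and its rank per line, separated by a tab. Because every user starts at 1 and each update conserves the total, the ranks sum to the number of users, so a rank above 1 marks a user who receives more than an average share.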