Merge producing unexpected results in R

I am trying to merge:

to_graph <- structure(list(Teacher = c("BS", "BS", "FA"
), Level = structure(c(2L, 1L, 1L), .Label = c("BE", "AE", "ME", 
"EE"), class = "factor"), Count = c(2L, 25L, 28L)), .Names = c("Teacher", 
"Level", "Count"), row.names = c(NA, 3L), class = "data.frame")

and

graph_avg <- structure(list(Teacher = structure(c(1L, 1L, 2L), .Label = c("BS", 
"FA"), class = "factor"), Count.Fraction = c(0.0740740740740741, 
0.925925925925926, 1)), .Names = c("Teacher", "Count.Fraction"
), row.names = c(NA, -3L), class = "data.frame")

with merge(to_graph, graph_avg, by="Teacher"), but instead of getting what I expect (3 rows), I get:

  Teacher Level Count Count.Fraction
1      BS    AE     2     0.07407407
2      BS    AE     2     0.92592593
3      BS    BE    25     0.07407407
4      BS    BE    25     0.92592593
5      FA    BE    28     1.00000000

Any ideas? Thank you!

#0

Not sure what you're trying to accomplish. merge is doing what it's supposed to here.

Let's look at all of the data.frames

graph_avg
  Teacher Count.Fraction
1      BS     0.07407407
2      BS     0.92592593
3      FA     1.00000000

to_graph
  Teacher Level Count
1      BS    AE     2
2      BS    BE    25
3      FA    BE    28

merge(to_graph, graph_avg)
  Teacher Level Count Count.Fraction
1      BS    AE     2     0.07407407
2      BS    AE     2     0.92592593
3      BS    BE    25     0.07407407
4      BS    BE    25     0.92592593
5      FA    BE    28     1.00000000

Before merging, I have to look at what the two data.frames have in common and what the outcome should be. Teacher is in both. But if I merge on Teacher alone, what should happen? BS is not a unique identifier: it appears twice in both data.frames, so merge pairs every BS row in one with every BS row in the other, which gives 2 x 2 = 4 BS rows plus 1 FA row. If BS appeared only once in one of them, this would be easy to solve. Teacher plus Level would be a unique identifier in to_graph, but graph_avg has no Level column to match on.

merge is really handy when you have a small data.frame with each teacher in it once, holding, say, the teacher's age or sex. You could merge that into another data.frame with repeated measures per teacher, and every row for a teacher would then carry those values. But for what you're doing it's not the right tool.
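To illustrate the case where merge does fit, here is a minimal sketch of a many-to-one merge. The teacher_info table and its Age values are made up for illustration and are not part of the question's data.

```r
# Data from the question, rebuilt with plain data.frame() for brevity.
to_graph <- data.frame(
  Teacher = c("BS", "BS", "FA"),
  Level   = c("AE", "BE", "BE"),
  Count   = c(2L, 25L, 28L)
)

# Hypothetical lookup table: each teacher appears exactly once.
teacher_info <- data.frame(
  Teacher = c("BS", "FA"),
  Age     = c(41, 35)
)

# Teacher is unique in teacher_info, so every row of to_graph
# matches exactly one lookup row and the row count is preserved.
merged <- merge(to_graph, teacher_info, by = "Teacher")
nrow(merged)  # 3
```

Because the key is unique on one side, no rows are multiplied; each BS row simply picks up the same Age.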

merge is not what you want here. If these are really your data.frames, and their rows are in the same order, use cbind instead.

cbind(to_graph, Count.Fraction = graph_avg$Count.Fraction)

  Teacher Level Count Count.Fraction
1      BS    AE     2     0.07407407
2      BS    BE    25     0.92592593
3      FA    BE    28     1.00000000

That's probably what you were looking for.

#1

Since it's quite obvious that one of your datasets is derived from the other, I would suggest you don't need a merge at all, but find a way of doing the analysis in such a way that all of the data remains intact.

For example, use ddply in package plyr to derive one set from the other. Note how this result contains all of the information you need:

> library(plyr)
> ddply(to_graph, .(Teacher), transform, Count.Fraction=Count/sum(Count))

  Teacher Level Count Count.Fraction
1      BS    AE     2     0.07407407
2      BS    BE    25     0.92592593
3      FA    BE    28     1.00000000
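If you would rather not load plyr, the same per-teacher fraction can probably be computed in base R with ave. This is a swapped-in alternative to the ddply call, sketched under the assumption that to_graph has the columns shown in the question.

```r
# Data from the question, rebuilt with plain data.frame() for brevity.
to_graph <- data.frame(
  Teacher = c("BS", "BS", "FA"),
  Level   = c("AE", "BE", "BE"),
  Count   = c(2L, 25L, 28L)
)

# ave() returns, for each row, the sum of Count within that row's Teacher
# group, so the division yields each row's share of its teacher's total.
teacher_total <- ave(to_graph$Count, to_graph$Teacher, FUN = sum)
to_graph$Count.Fraction <- to_graph$Count / teacher_total
to_graph
```

Either way, the fraction is derived directly from to_graph, so Level is never lost and no merge is needed.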

To answer your question about merge: a merge in R is similar to a database join. To join two tables one-to-one, you need to be certain that you can match the primary key in both tables. The primary key in your case is the combination of Teacher and Level. Since the Level column doesn't exist in your second data.frame, a one-to-one merge is impossible; merging on Teacher alone returns every pairing of rows that share a teacher, which is where your five rows come from.
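A quick way to see this before merging is to check whether the key is duplicated on each side and to count the pairings a join on Teacher would produce. The following is a small sketch using the question's data; the variable names are only for illustration.

```r
# Data from the question, rebuilt with plain data.frame() for brevity.
to_graph <- data.frame(
  Teacher = c("BS", "BS", "FA"),
  Level   = c("AE", "BE", "BE"),
  Count   = c(2L, 25L, 28L)
)
graph_avg <- data.frame(
  Teacher        = c("BS", "BS", "FA"),
  Count.Fraction = c(2 / 27, 25 / 27, 1)
)

# anyDuplicated() returns 0 only when every key value is unique.
key_dup_left  <- anyDuplicated(to_graph$Teacher) > 0   # TRUE
key_dup_right <- anyDuplicated(graph_avg$Teacher) > 0  # TRUE

# Rows produced by an inner join on Teacher: for each shared key,
# (matches on the left) * (matches on the right), summed over keys.
left_n  <- table(to_graph$Teacher)
right_n <- table(graph_avg$Teacher)
common  <- intersect(names(left_n), names(right_n))
expected_rows <- sum(as.vector(left_n[common]) * as.vector(right_n[common]))
expected_rows                                          # 5
nrow(merge(to_graph, graph_avg, by = "Teacher"))       # 5
```

When the key is duplicated on both sides, as here, the merged result grows beyond either input.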

The only way to recover this situation is to add the missing bit of the primary key back to the data. Assuming that the data is sorted in exactly the same order, you can do this with cbind and then do the merge:

> merge(to_graph, cbind(graph_avg, Level=to_graph$Level))
  Teacher Level Count Count.Fraction
1      BS    AE     2     0.07407407
2      BS    BE    25     0.92592593
3      FA    BE    28     1.00000000
