Clones or Relatives? Understanding the Origins of Similar Android Apps
Yuta Ishii Takuya Watanabe
Waseda University Waseda University
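Abstract
Because repackaging an Android app is no longer difficult, many cloned apps have appeared; we call them “clones” in this work. As previous studies have reported, malicious parties create clones for harmful purposes, such as adding malicious functions, injecting or replacing advertising modules, and piracy. Besides such clones, there are also legitimate similar apps, which we call “relatives.” Relatives are not clones but are similar in nature: they are generated by the same app-building service or by the same developer using the same template. Given these observations, this paper aims to answer two research questions: (RQ1) How can we distinguish clones from relatives? (RQ2) What is the breakdown of clones and relatives in the official and third-party marketplaces? To answer the first question, we developed a scalable framework called APPraiser, which systematically extracts similar apps and classifies them into clones and relatives. Our key algorithms exploit the sparseness of the data and run in O(n) time in practice. To answer the second question, we applied APPraiser to more than 1.3 million apps collected from official and third-party marketplaces. Our analysis shows that in the official marketplace 79% of similar apps are relatives, whereas in the third-party marketplace 50% of similar apps are clones. In both marketplaces, most relatives are apps developed by prolific developers. We also found that, in the third-party marketplace, 76% of the clones that were originally published in the official marketplace are malware. To the best of our knowledge, this is the first work to clarify the breakdown of “similar” Android apps and to quantify their origins using a huge dataset comparable in size to the official market.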
Keywords
Mobile security, Android, repackaging, large-scale data
ABSTRACT
Since it is not hard to repackage an Android app, there are many cloned apps, which we call “clones” in this work. As previous studies have reported, clones are generated for bad purposes by malicious parties, e.g., adding malicious functions, injecting/replacing advertising modules, and piracy. Besides such clones, there are legitimate, similar apps, which we call “relatives” in this work. These relatives are not clones but are similar in nature; i.e., they are generated by the same app-building service or by the same developer using the same template. Given these observations, this paper aims to answer the following two research questions: (RQ1) How can we distinguish between clones and relatives? (RQ2) What is the breakdown of clones and relatives in the official and third-party marketplaces? To answer the first research question, we developed a scalable framework called APPraiser that systematically extracts similar apps and classifies them into clones and relatives. We note that our key algorithms, which leverage sparseness of the data, have a time complexity of O(n) in practice. To answer the second research question, we applied the APPraiser framework to more than 1.3 million apps collected from official and third-party marketplaces. Our analysis revealed the following findings: In the official marketplace, 79% of similar apps were attributed to relatives, while in the third-party marketplace, 50% of similar apps were attributed to clones. The majority of relatives are apps developed by prolific developers in both marketplaces. We also found that in the third-party market, of the clones that were originally published in the official market, 76% are malware. To the best of our knowledge, this is the first work to clarify the breakdown of “similar” Android apps and to quantify their origins using a huge dataset equivalent in size to the official market.
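The abstract does not spell out how APPraiser works, so the following Python sketch is illustrative only and is not the authors' algorithm. It shows one common way sparse data can keep similarity search close to linear in practice: an inverted index generates candidate pairs only among apps that share at least one feature, and exact Jaccard similarity is then computed for those candidates. The function name `classify_similar_apps`, the feature sets, the `threshold` and `max_bucket` parameters, and the rule "same developer means relative, otherwise clone" are all assumptions made for this example. In particular, the rule ignores relatives produced by a shared app-building service.

```python
from collections import defaultdict
from itertools import combinations


def classify_similar_apps(apps, threshold=0.6, max_bucket=50):
    """Label similar app pairs as "clone" or "relative" (hypothetical sketch).

    apps maps app_id -> (developer_id, set_of_feature_hashes).
    """
    # Inverted index: feature -> list of apps that contain it.
    index = defaultdict(list)
    for app_id, (_dev, features) in apps.items():
        for feature in features:
            index[feature].append(app_id)

    # Candidate pairs are apps that co-occur under at least one feature.
    # Very common features (for example shared library files) are skipped
    # so that a single large bucket cannot cause a quadratic blow-up.
    candidates = set()
    for ids in index.values():
        if len(ids) > max_bucket:
            continue
        candidates.update(combinations(sorted(ids), 2))

    labels = {}
    for a, b in candidates:
        dev_a, feats_a = apps[a]
        dev_b, feats_b = apps[b]
        jaccard = len(feats_a & feats_b) / len(feats_a | feats_b)
        if jaccard >= threshold:
            # Assumed rule: same developer -> relative, otherwise clone.
            labels[(a, b)] = "relative" if dev_a == dev_b else "clone"
    return labels


# Toy data: integers stand in for hashes of files inside each app.
SAMPLE_APPS = {
    "orig": ("devA", {1, 2, 3, 4, 5}),
    "repack": ("devX", {1, 2, 3, 4, 6}),
    "sibling": ("devA", {1, 2, 3, 4, 7}),
    "other": ("devB", {10, 11, 12}),
}
result = classify_similar_apps(SAMPLE_APPS)
```

In this toy run, "other" shares no feature with any app, so it never becomes a candidate and no similarity is computed for it. That pruning is the sense in which sparseness helps: the work grows with the number of co-occurring pairs, not with all possible pairs.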