Is anyone familiar with the patpat.com brand? Could you share some pointers?

How do you get American moms to willingly pay for Chinese mother-and-baby products? | Haier Community
Serving "foreigners" only: from selling cartoons to selling mother-and-baby products

I worked at Oracle for many years, always on big-data projects, so I have solid experience in building big-data platforms and integrating data; PatPat's big-data platform is something I put together in a spare evening. What shaped PatPat more, though, was my experience selling animated films five years ago. In 2011 I stepped away from Oracle for a while to help my family's company sell Chinese cartoons abroad. My partner Gao Can entered cross-border e-commerce even earlier: back in 2008 he was already shipping Wenzhou shoes to the United States.

It may surprise many people, but plenty of Chinese cartoons sold quite well overseas. We carried more than 150 titles, each with overseas distribution rights, and total sales came to over ten million RMB a year. The Chinese cartoons that found a market in the US were mostly positive stories consistent with American values, such as The Twelve Zodiac Animals, and a Guangdong cartoon called Mao Mao Wang about the great-grandson of the Monkey King. By contrast, Pleasant Goat (Xi Yang Yang), a hit in China, did poorly in the US: it contains too much violence and would count as an R-rated cartoon there.

The cartoon business taught me that cultural products, film and television above all, live and die by brand and image. At the time Chinese cartoons had essentially no decent subtitle translation; what existed was laughable Chinglish. So we built a two-layer translation pipeline: Chinese translators first turned the script into reasonably formal English, then a native-English team abroad (we used a Singapore team) polished it, and we adjusted details for Western cultural habits. For posters, trailers, and synopses we studied large numbers of comparable foreign cartoons and westernized the style. We also advertised in the audio-visual centers at trade shows and attended the sellers' dinners that Chinese exhibitors never went to. All of this made our cartoons look like international brands, while most Chinese producers sat idle in the China pavilion. Selling Chinese-made children's clothing, diapers, and similar products abroad through PatPat likewise cannot avoid brand building.

Staying out of a grey zone worth hundreds of billions

The ultimate goal of the internet is to eliminate information asymmetry, and e-commerce happens to solve one kind of it. The US has the market, the brands, and the channels; Chinese factories have the manufacturing capability, so why not go abroad? Over the past five to ten years a large number of Chinese factories have moved beyond pure OEM work, and many now do ODM. Export cross-border e-commerce has become an irreversible trend. Its supply chain is long, and the traditional model involved many middlemen: exporters, importers, wholesalers, retailers. In recent years many fast-growing cross-border sellers have crudely cut all of them out.

It is fair to say that 80 to 90 percent of China's export cross-border e-commerce meets neither China's export standards nor America's import standards. These are effectively uncertified "three-no" products, commonly called junk goods. Junk is worse than fakes: a fake of acceptable quality can be tolerated, but junk can kill, for example products made with lead-containing or radioactive materials. Such goods reach the US by international courier without passing any checkpoint; US customs only spot-checks, so most of it slips through. This is a grey zone worth hundreds of billions a year. I have always believed it cannot last: Americans will not fail to notice, and China will not leave it unregulated. In the long run it also does serious damage to factories' brands and to "Made in China" itself.

I have also kept thinking about whether cutting out the middlemen is reasonable. In my view, the fact that they existed for so many years means they served a purpose. We can remove the middlemen, but we cannot remove the functions they performed. Exporters, for example, verify that products meet China's export standards. Importers do a great deal of testing, because Western countries have strict requirements for imported goods: US toys must meet ASTM standards, children's clothing must pass flammability tests, and there are limits on lead content. Every product for children must carry an identification number and correct labeling, and small items must carry choking-hazard warnings. These are requirements of US law and cannot be fudged.

One important reason PatPat chose the mother-baby and toy market (in practice more mother-baby) is precisely that it is covered by so many standards: the entry threshold is high, which makes it easier to build a moat. Beyond that, from a business standpoint, mother-baby products have a high repurchase rate, and for mobile e-commerce the importance of repurchase rate goes without saying. My partner and I both have children and buy things for them constantly, and we could clearly feel how much room this market has.

The indispensable American mom squad

Samples and bulk shipments are two different things; that happens even in traditional international trade. To keep junk off the platform, PatPat does a great deal of work. We insist on doing product selection and quality control ourselves, because we have to answer for what is sold on the platform. Rule number one of our selection standard: no junk. Following the practices of multinationals such as Walmart and Tesco, every item that goes live is put through strict US-standard quality verification. Our QC team came out of Walmart, and since Walmart is seen in the US as selling "cheap goods", we set our bar a bit higher than Walmart's.

Misuse of American IP by Chinese suppliers is also widespread; products carrying Disney logos are countless. So during selection we also guard against fakes and knockoffs, and educating suppliers is hard work. We have to explain two concepts over and over: lawful imitation versus outright copying. Baidu's search works much like Google's, yet that causes Baidu no trouble, because Baidu localized. There is plenty of imitation in the US itself; Lego has a famous "copycat brother" called Mega Bloks. Complete copying, one hundred percent identical, is not compliant, unless the copied company has not registered its IP (we normally check the US IP databases) and has not built broad word of mouth. Sometimes we see children's clothing merchants copy a foreign brand outright, and we have to tell them patiently: "Stop doing this. Change the wording, move the pictures around, and it is no longer copying but lawful imitation."

We handed the job of spotting knockoffs to the American mom squad, a Facebook Group we created with more than 200 mothers, most of them our seed users. They are regulars on mother-baby e-commerce sites and in offline stores, and they can often tell at a glance whether a Chinese product looks identical to a US one. Beyond spotting knockoffs, the moms uncover other problems. Some of them work in US consumer-safety agencies; when they see a safety risk they will ask, "Does this product have a CPC (Children's Product Certificate)? If not, take it down quickly." They also know the US market and give us plenty of purchasing advice, even pricing advice.

The mom squad works much like the Xiaomi community: the point is participation. They commit to doing at least three things for PatPat every month. Their sense of achievement comes from having helped raise the brand: as it grows, part of the credit is theirs. As mothers they have a natural motivation, because a product with quality problems can harm babies, and their first reaction is to say no. Since they are loyal PatPat customers, we reward them with discounts and coupons, which is still profitable for us: with PatPat coupons in hand, they end up spending on PatPat. Fighting counterfeits looks like a long, labor-intensive chain, yet we manage it with very few people, because we lean on outside forces like the mom squad. We use such external forces, especially consumers, in other areas too; that is something the sharing economy taught us.

Helping Chinese manufacturers build brands

When Chinese manufacturers can make good products, whether the brand stands up is a separate question. In other words, besides being good, the product has to look like a branded product. Brand shows up in visual presentation; some Chinese toy exporters supply product descriptions with serious English grammar problems. We built a five-person copywriting team: three in China translate Chinese into English, and two American colleagues polish it, so the product reads as if it were American. The same product instantly looks classier once the description and manual are rewritten. Beyond the copy, we hire professional photographers to shoot products in a more Western style. A merchandiser of ours in the US can review several hundred products a day just to judge whether they look international; if not, they do not go live. We also engaged a New York designer for packaging, and even our invoices are printed on 120 gsm paper. Dry-cleaning conventions differ between China and the US, for instance, so we remake labels and hang tags to US standards. These may look like cosmetic details, but they genuinely help build the brand. When the quality is good, the packaging is decent, and the price is right, consumers feel they are getting value. Around September 2015, half a year after launch, many customers said they could not tell that PatPat had anything to do with China. Today every product on the PatPat platform carries the PatPat label. We tell suppliers we would love for them to use their own brands, but for many small and mid-sized suppliers that is still a long way off.

The channel itself also carries brand value, and we take PatPat's value as a channel seriously. We put enormous care into the app's design, down to every icon and every pixel. That is probably why the PatPat app was featured among the top three on the Apple App Store home page: around late August and early September last year, PatPat was recommended in the prime spot at the top of the App Store home page.

The supply chain that hurts to talk about

In the US the rules are mostly on the table, and some principles cannot be touched; the standards mentioned above are matters of principle where mistakes are not tolerated. Americans follow the contract once it is signed; we have signed contracts with dozens of US suppliers before the goods were even in stock and never had a single problem. Some Chinese suppliers, by contrast, will sign a contract even when they are short of stock or short of capacity, and it also happens that the sample is fine while the bulk goods are not. We have been burned on the supply chain: last September we had orders we could not ship. It was painful, but there is no way around it; doing this business means accepting suppliers' weaknesses and improving together with them. Now we firmly require Chinese suppliers to stock goods in PatPat's China warehouse, and we inspect not just samples but every shipment.

Relatively backward logistics is another constraint on overseas e-commerce, since logistics takes a large share of the final price, often more than 30 percent: roughly 10 to 20 percent for apparel, 20 to 30 percent for mother-baby products, and as much as 40 to 50 percent for toys. In my view logistics will eventually have to be done in-house, but we are not at that stage yet and still use solutions available on the market. Which logistics partner to use was figured out by trial and error, like Shennong tasting a hundred herbs: PatPat tried many options, and different volumes call for different solutions; below a certain volume, for instance, DHL will not take your shipments.

Another question for a cross-border platform is whether to work with small and mid-sized suppliers or with big brands. My experience is that starting with smaller suppliers is easier, and PatPat works mostly with them: they are innovative and keep turning out novel products. Big suppliers come in two kinds: those large enough to have their own brands, and OEMs; OEMs need large order volumes and are not very interested in a platform like ours. The difficulty with big suppliers is high prices and high thresholds; the upside is that they have the strength to hold inventory. There is no shortcut to signing suppliers: we won them over one by one, and in the company's early days I negotiated with merchants myself.

In America, every household is a cross-border e-commerce business

There is a saying that every American family is in cross-border e-commerce, because every household buys goods from China. One thing worth stressing is that PatPat has never sold only Chinese goods. PatPat targets young mothers, "millennial moms" in the US. Born in the 1980s and 1990s, they like novel things and love buying them; compared with teenagers they have spending power, but they are not as wealthy as people in their forties and fifties and still want a bargain. They buy Chinese brands as well as American ones. To satisfy their appetite for novelty, the American products PatPat lists usually have some scarcity, mostly small local US brands sold through local channels. Commercially, selling American goods also pulls along the consumption of Chinese goods.

Our team is now thirty-plus people, with the main force in China and six employees in the US, but I always say PatPat is an American company. To judge which country a company belongs to, look first at whose standards it follows, and PatPat operates to US standards. Our core team are Chinese who studied and worked in the US for a long time, with experience at top Western companies including Oracle, IBM, ShopKick, and Vayama, so we have an edge in localized marketing and a good grasp of American consumer psychology. There is no question, though, that in the US market we must compete with many American mobile-commerce apps. Fortunately, competition in China's e-commerce market is at least as fierce as in the US; our playbook against American retailers has a Chinese-e-commerce flavor: compared with US e-commerce companies we are far heavier on operations and stronger on execution.
(Writing this up takes effort, so a like is appreciated. Discussion and reposting are welcome; the article is also published on the WeChat public account 机器学习算法全栈工程师 (Jeemy110).)

Previous article: 你必须要知道CNN模型:ResNet ("The CNN model you must know: ResNet"), zhuanlan.zhihu.com

Preface

In computer vision, convolutional neural networks (CNNs) have become the mainstream approach; recent examples include GoogLeNet, VGG-19, and Inception. A milestone in CNN history was ResNet, which made it possible to train much deeper models and thus reach higher accuracy. The core of ResNet is the shortcut (skip) connection between earlier and later layers, which helps gradients propagate backward during training. This article introduces DenseNet, whose basic idea is similar to ResNet's, except that it builds dense connections from all preceding layers to each later layer, hence the name. DenseNet's other hallmark is feature reuse through channel-wise concatenation of features. These properties let DenseNet outperform ResNet with fewer parameters and lower computational cost, and they earned it the CVPR 2017 best paper award. The article first explains the idea and architecture of DenseNet, then walks through a PyTorch implementation.

Design philosophy

Compared with ResNet, DenseNet proposes a more aggressive connection scheme: every layer is connected to every other layer, that is, each layer receives all preceding layers as additional input. Figure 1 shows ResNet's connectivity and Figure 2 DenseNet's. In ResNet, each layer is short-circuited to an earlier layer (usually two or three layers back) by element-wise addition. In DenseNet, each layer is concatenated with all preceding layers along the channel dimension (the feature maps share the same spatial size, as explained below) and the result is fed to the next layer. For an $L$-layer network, DenseNet contains $L(L+1)/2$ connections, which is dense compared with ResNet. Because DenseNet directly concatenates feature maps from different layers, it can reuse features and is therefore more efficient; this is its main difference from ResNet.

[Figure 1: ResNet's shortcut connections ("+" denotes element-wise addition)]
[Figure 2: DenseNet's dense connections ("c" denotes channel-wise concatenation)]

In formulas, a traditional network computes at layer $l$

$$x_l = H_l(x_{l-1}),$$

ResNet adds an identity term from the previous layer,

$$x_l = H_l(x_{l-1}) + x_{l-1},$$

and DenseNet concatenates all preceding outputs,

$$x_l = H_l([x_0, x_1, \dots, x_{l-1}]),$$

where $H_l(\cdot)$ is a nonlinear transformation, a composite operation that may include batch normalization (BN), ReLU, pooling, and convolution. Note that between "layer" $l-1$ and "layer" $l$ there may in fact be several convolutional layers.

Figure 3 illustrates the forward pass and makes the dense connectivity more intuitive: the input of $h_3$, for example, includes not only $x_2$ produced by $h_2$ but also $x_0$ and $x_1$ from the earlier layers, all joined along the channel dimension.

[Figure 3: the forward pass of DenseNet]

A CNN normally reduces the feature-map size with pooling or with convolutions of stride greater than 1, whereas dense connections require the feature maps to keep the same size. DenseNet therefore uses a DenseBlock + Transition structure: a DenseBlock is a module of many layers whose feature maps share the same size and which are densely connected, while a Transition module joins two adjacent DenseBlocks and shrinks the feature maps by pooling. Figure 4 shows a DenseNet built from four DenseBlocks joined by Transition layers.

[Figure 4: a DenseNet built from DenseBlocks and Transition layers]
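To make the concatenation concrete, here is a minimal, self-contained sketch of the dense rule $x_l = H_l([x_0, \dots, x_{l-1}])$ expressed with `torch.cat`. The helper `make_h`, the growth-rate values, and the tensor names are my own illustration, not part of the article; the full implementation follows later.

```python
import torch
import torch.nn as nn

# A toy H_l: BN + ReLU + 3x3 Conv producing k new feature maps.
def make_h(in_channels, growth_rate=12):
    return nn.Sequential(
        nn.BatchNorm2d(in_channels),
        nn.ReLU(inplace=True),
        nn.Conv2d(in_channels, growth_rate, kernel_size=3, padding=1, bias=False),
    )

k0, k = 24, 12
x0 = torch.randn(1, k0, 32, 32)           # block input with k0 channels
h1 = make_h(k0, k)
h2 = make_h(k0 + k, k)
h3 = make_h(k0 + 2 * k, k)

x1 = h1(x0)                               # k new feature maps
x2 = h2(torch.cat([x0, x1], dim=1))       # input = [x0, x1]
x3 = h3(torch.cat([x0, x1, x2], dim=1))   # input = [x0, x1, x2]
print(x3.shape)                           # torch.Size([1, 12, 32, 32])
```

Each layer only produces `k` feature maps of its own; the widening input comes entirely from reusing earlier features.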
Network architecture

As described above, DenseNet consists mainly of DenseBlocks and Transition layers (Figure 5). The details are as follows.

[Figure 5: the overall DenseNet architecture]

Inside a DenseBlock all layers produce feature maps of the same size, so they can be concatenated along the channel dimension. The composite nonlinear function $H(\cdot)$ uses the BN + ReLU + 3x3 Conv structure shown in Figure 6. Unlike ResNet, every layer inside a DenseBlock outputs $k$ feature maps, i.e., it uses $k$ convolution kernels. $k$ is called the growth rate and is a hyperparameter; a fairly small value (for example 12) is usually enough for good performance. If the block input has $k_0$ channels, then layer $l$ receives $k_0 + k(l-1)$ input channels, so the input grows with depth even when $k$ is small. This is a consequence of feature reuse: each layer contributes only $k$ feature maps of its own.

[Figure 6: the composite function used inside a DenseBlock]

Because the inputs of later layers become large, a bottleneck layer can be inserted inside the DenseBlock to reduce computation: a 1x1 convolution is added to the original structure, giving BN + ReLU + 1x1 Conv + BN + ReLU + 3x3 Conv, known as DenseNet-B (Figure 7). The 1x1 convolution produces $4k$ feature maps; its role is to reduce the number of features and improve efficiency.

[Figure 7: a DenseBlock layer with a bottleneck]

The Transition layer joins two adjacent DenseBlocks and reduces the feature-map size. It consists of a 1x1 convolution and 2x2 average pooling: BN + ReLU + 1x1 Conv + 2x2 AvgPooling. It can also compress the model: if the preceding DenseBlock outputs $m$ channels, the Transition layer produces $\lfloor \theta m \rfloor$ of them through its convolution, where $\theta \in (0, 1]$ is the compression rate. With $\theta = 1$ the number of features is unchanged; with $\theta < 1$ the structure is called DenseNet-C, and the paper uses $\theta = 0.5$. A network that uses both bottleneck layers and a compression rate below 1 is called DenseNet-BC.

DenseNet was evaluated on three image-classification datasets (CIFAR, SVHN, and ImageNet). For the first two, the input images are 32x32; before the first DenseBlock there is a single 3x3 convolution (stride 1) with 16 kernels ($2k$ for DenseNet-BC). The network contains three DenseBlocks with feature maps of size 32x32, 16x16, and 8x8, each block holding the same number of layers. After the last DenseBlock come a global average pooling layer and a softmax classifier. All 3x3 convolutions use padding of 1 so that the feature-map size is preserved. The basic DenseNet uses the configurations {L=40, k=12}, {L=100, k=12}, and {L=40, k=24}; DenseNet-BC uses {L=100, k=12}, {L=250, k=24}, and {L=190, k=40}. Here $L$ is the total depth of the network, counting only layers with trainable weights; parameter-free layers such as pooling are not counted, and BN layers, although they do have parameters, are counted together with the convolution they accompany. For the plain {L=40, k=12} network, removing the first convolution, the two convolutions inside the Transition layers, and the final Linear layer leaves 36 layers, that is, 12 layers per DenseBlock; the layer counts of the other configurations follow from the same arithmetic.

For ImageNet the input size is 224x224 and the network is a DenseNet-BC with four DenseBlocks. It starts with a 7x7 convolution of stride 2 ($2k$ kernels) followed by 3x3 max pooling of stride 2, and only then enters the DenseBlocks. Table 1 lists the configurations used for ImageNet.

[Table 1: DenseNet configurations used on ImageNet]
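The channel bookkeeping and the layer counting above are easy to verify with a few lines of arithmetic. The helper below is purely illustrative (the function name `input_channels` and the sample values are my own), not part of the model code.

```python
def input_channels(l, k0=16, k=12):
    """Number of input channels of the l-th layer (1-indexed) inside a DenseBlock."""
    return k0 + k * (l - 1)

# Channel growth inside one block with k0 = 16 and growth rate k = 12.
print([input_channels(l) for l in range(1, 7)])    # [16, 28, 40, 52, 64, 76]

# Basic DenseNet with L = 40: subtract the first conv, the two Transition convs
# and the final Linear layer, then split the remainder over 3 DenseBlocks.
L = 40
layers_per_block = (L - 1 - 2 - 1) // 3
print(layers_per_block)                             # 12
```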
Experimental results and discussion

Figures 8 and 9 compare DenseNet with ResNet on CIFAR-100 and ImageNet. In Figure 8, DenseNet-100 with only 0.8M parameters already beats ResNet-1001, which has 10.2M. In Figure 9, for equal parameter budgets DenseNet also outperforms ResNet. More results can be found in the original paper.

[Figure 8: ResNet vs. DenseNet on CIFAR-100]
[Figure 9: ResNet vs. DenseNet on ImageNet]

Overall, DenseNet's advantages are the following:

- The dense connections improve gradient back-propagation and make the network easier to train; because every layer has a direct path to the final error signal, the network enjoys a form of implicit "deep supervision".
- It uses fewer parameters and less computation. This sounds counter-intuitive, but the concatenation-based shortcuts realize feature reuse, and with a small growth rate the feature maps that each layer owns are few.
- Thanks to feature reuse, the final classifier also makes use of low-level features.

One caveat is that a careless implementation of DenseNet can consume a lot of GPU memory. A more efficient implementation is sketched in Figure 10; see the paper "Memory-Efficient Implementation of DenseNets" for details. The PyTorch implementation below can apply this kind of optimization automatically.

[Figure 10: a more memory-efficient way to implement DenseNet]
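One common way to keep the memory cost of the concatenated inputs under control is gradient checkpointing: the cheap BN + ReLU + 1x1 Conv part is recomputed during the backward pass instead of having its activations stored. The snippet below is only a hedged sketch of that idea with a made-up module name (`CheckpointedBottleneck`); it is not the implementation from the paper referenced above, and newer torchvision releases expose a comparable option through a `memory_efficient` flag on their DenseNet models.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class CheckpointedBottleneck(nn.Module):
    """Dense layer whose bottleneck activations are recomputed on backward (illustrative only)."""
    def __init__(self, in_channels, growth_rate=32, bn_size=4):
        super().__init__()
        self.bottleneck = nn.Sequential(
            nn.BatchNorm2d(in_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels, bn_size * growth_rate, kernel_size=1, bias=False),
        )
        self.conv = nn.Sequential(
            nn.BatchNorm2d(bn_size * growth_rate),
            nn.ReLU(inplace=True),
            nn.Conv2d(bn_size * growth_rate, growth_rate, kernel_size=3, padding=1, bias=False),
        )

    def forward(self, x):
        # The bottleneck output is not kept in memory; it is recomputed during backward.
        out = checkpoint(self.bottleneck, x)
        return torch.cat([x, self.conv(out)], dim=1)

layer = CheckpointedBottleneck(64)
y = layer(torch.randn(2, 64, 56, 56, requires_grad=True))
print(y.shape)   # torch.Size([2, 96, 56, 56])
```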
Implementing DenseNet with PyTorch

Here we implement DenseNet with the PyTorch framework, which now also supports Windows. PyTorch ships an official DenseNet implementation in torchvision.models; it is the DenseNet-BC variant for ImageNet, and the walk-through below follows the same structure.

First we implement the basic unit inside a DenseBlock, the bottleneck structure BN + ReLU + 1x1 Conv + BN + ReLU + 3x3 Conv, with a dropout layer for use during training (the imports are added here so that the snippets are self-contained):

```python
import re
from collections import OrderedDict

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.utils.model_zoo as model_zoo


class _DenseLayer(nn.Sequential):
    """Basic unit of DenseBlock (using bottleneck layer)"""
    def __init__(self, num_input_features, growth_rate, bn_size, drop_rate):
        super(_DenseLayer, self).__init__()
        self.add_module("norm1", nn.BatchNorm2d(num_input_features))
        self.add_module("relu1", nn.ReLU(inplace=True))
        self.add_module("conv1", nn.Conv2d(num_input_features, bn_size * growth_rate,
                                           kernel_size=1, stride=1, bias=False))
        self.add_module("norm2", nn.BatchNorm2d(bn_size * growth_rate))
        self.add_module("relu2", nn.ReLU(inplace=True))
        self.add_module("conv2", nn.Conv2d(bn_size * growth_rate, growth_rate,
                                           kernel_size=3, stride=1, padding=1, bias=False))
        self.drop_rate = drop_rate

    def forward(self, x):
        new_features = super(_DenseLayer, self).forward(x)
        if self.drop_rate > 0:
            new_features = F.dropout(new_features, p=self.drop_rate, training=self.training)
        # Concatenate the input with the newly produced feature maps (the dense connection).
        return torch.cat([x, new_features], 1)
```
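As a quick sanity check (with hypothetical shapes of my own choosing, not from the original post), a single `_DenseLayer` returns its input concatenated with `growth_rate` new feature maps:

```python
layer = _DenseLayer(num_input_features=64, growth_rate=32, bn_size=4, drop_rate=0)
x = torch.randn(4, 64, 56, 56)
print(layer(x).shape)   # torch.Size([4, 96, 56, 56]) -> 64 input channels + 32 new ones
```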
Building on this, we implement the DenseBlock module, whose internal layers are densely connected and whose per-layer input width therefore grows linearly:

```python
class _DenseBlock(nn.Sequential):
    """DenseBlock"""
    def __init__(self, num_layers, num_input_features, bn_size, growth_rate, drop_rate):
        super(_DenseBlock, self).__init__()
        for i in range(num_layers):
            # Layer i sees the block input plus the i*growth_rate features added by earlier layers.
            layer = _DenseLayer(num_input_features + i * growth_rate, growth_rate, bn_size,
                                drop_rate)
            self.add_module("denselayer%d" % (i + 1,), layer)
```
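The output width of a `_DenseBlock` is therefore `num_input_features + num_layers * growth_rate`; for example (illustrative values only):

```python
block = _DenseBlock(num_layers=6, num_input_features=64, bn_size=4,
                    growth_rate=32, drop_rate=0)
x = torch.randn(4, 64, 56, 56)
print(block(x).shape)   # torch.Size([4, 256, 56, 56]) -> 64 + 6 * 32 channels
```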
&/code&&/pre&&/div&&p&此外,我们实现Transition层,它主要是一个卷积层和一个池化层:&/p&&div class=&highlight&&&pre&&code class=&language-python&&&span&&/span&&span class=&k&&class&/span& &span class=&nc&&_Transition&/span&&span class=&p&&(&/span&&span class=&n&&nn&/span&&span class=&o&&.&/span&&span class=&n&&Sequential&/span&&span class=&p&&):&/span&
&span class=&sd&&&&&Transition layer between two adjacent DenseBlock&&&&/span&
&span class=&k&&def&/span& &span class=&nf&&__init__&/span&&span class=&p&&(&/span&&span class=&bp&&self&/span&&span class=&p&&,&/span& &span class=&n&&num_input_feature&/span&&span class=&p&&,&/span& &span class=&n&&num_output_features&/span&&span class=&p&&):&/span&
&span class=&nb&&super&/span&&span class=&p&&(&/span&&span class=&n&&_Transition&/span&&span class=&p&&,&/span& &span class=&bp&&self&/span&&span class=&p&&)&/span&&span class=&o&&.&/span&&span class=&n&&__init__&/span&&span class=&p&&()&/span&
&span class=&bp&&self&/span&&span class=&o&&.&/span&&span class=&n&&add_module&/span&&span class=&p&&(&/span&&span class=&s2&&&norm&&/span&&span class=&p&&,&/span& &span class=&n&&nn&/span&&span class=&o&&.&/span&&span class=&n&&BatchNorm2d&/span&&span class=&p&&(&/span&&span class=&n&&num_input_feature&/span&&span class=&p&&))&/span&
&span class=&bp&&self&/span&&span class=&o&&.&/span&&span class=&n&&add_module&/span&&span class=&p&&(&/span&&span class=&s2&&&relu&&/span&&span class=&p&&,&/span& &span class=&n&&nn&/span&&span class=&o&&.&/span&&span class=&n&&ReLU&/span&&span class=&p&&(&/span&&span class=&n&&inplace&/span&&span class=&o&&=&/span&&span class=&bp&&True&/span&&span class=&p&&))&/span&
&span class=&bp&&self&/span&&span class=&o&&.&/span&&span class=&n&&add_module&/span&&span class=&p&&(&/span&&span class=&s2&&&conv&&/span&&span class=&p&&,&/span& &span class=&n&&nn&/span&&span class=&o&&.&/span&&span class=&n&&Conv2d&/span&&span class=&p&&(&/span&&span class=&n&&num_input_feature&/span&&span class=&p&&,&/span& &span class=&n&&num_output_features&/span&&span class=&p&&,&/span&
&span class=&n&&kernel_size&/span&&span class=&o&&=&/span&&span class=&mi&&1&/span&&span class=&p&&,&/span& &span class=&n&&stride&/span&&span class=&o&&=&/span&&span class=&mi&&1&/span&&span class=&p&&,&/span& &span class=&n&&bias&/span&&span class=&o&&=&/span&&span class=&bp&&False&/span&&span class=&p&&))&/span&
&span class=&bp&&self&/span&&span class=&o&&.&/span&&span class=&n&&add_module&/span&&span class=&p&&(&/span&&span class=&s2&&&pool&&/span&&span class=&p&&,&/span& &span class=&n&&nn&/span&&span class=&o&&.&/span&&span class=&n&&AvgPool2d&/span&&span class=&p&&(&/span&&span class=&mi&&2&/span&&span class=&p&&,&/span& &span class=&n&&stride&/span&&span class=&o&&=&/span&&span class=&mi&&2&/span&&span class=&p&&))&/span&
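Fed with the block output from the previous check, a `_Transition` with compression rate 0.5 halves both the channel count and the spatial resolution (again just an illustrative check of my own):

```python
trans = _Transition(num_input_feature=256, num_output_features=128)
x = torch.randn(4, 256, 56, 56)
print(trans(x).shape)   # torch.Size([4, 128, 28, 28])
```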
Finally we assemble the full DenseNet network:

```python
class DenseNet(nn.Module):
    """DenseNet-BC model"""
    def __init__(self, growth_rate=32, block_config=(6, 12, 24, 16), num_init_features=64,
                 bn_size=4, compression_rate=0.5, drop_rate=0, num_classes=1000):
        """
        :param growth_rate: (int) number of filters used in DenseLayer, `k` in the paper
        :param block_config: (list of 4 ints) number of layers in each DenseBlock
        :param num_init_features: (int) number of filters in the first Conv2d
        :param bn_size: (int) the factor used in the bottleneck layer
        :param compression_rate: (float) the compression rate used in the Transition layer
        :param drop_rate: (float) the drop rate after each DenseLayer
        :param num_classes: (int) number of classes for classification
        """
        super(DenseNet, self).__init__()
        # first Conv2d
        self.features = nn.Sequential(OrderedDict([
            ("conv0", nn.Conv2d(3, num_init_features, kernel_size=7, stride=2, padding=3, bias=False)),
            ("norm0", nn.BatchNorm2d(num_init_features)),
            ("relu0", nn.ReLU(inplace=True)),
            ("pool0", nn.MaxPool2d(3, stride=2, padding=1))
        ]))

        # DenseBlocks, each followed by a Transition layer except the last one
        num_features = num_init_features
        for i, num_layers in enumerate(block_config):
            block = _DenseBlock(num_layers, num_features, bn_size, growth_rate, drop_rate)
            self.features.add_module("denseblock%d" % (i + 1), block)
            num_features += num_layers * growth_rate
            if i != len(block_config) - 1:
                transition = _Transition(num_features, int(num_features * compression_rate))
                self.features.add_module("transition%d" % (i + 1), transition)
                num_features = int(num_features * compression_rate)

        # final BN + ReLU
        self.features.add_module("norm5", nn.BatchNorm2d(num_features))
        self.features.add_module("relu5", nn.ReLU(inplace=True))

        # classification layer
        self.classifier = nn.Linear(num_features, num_classes)

        # parameter initialization
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.constant_(m.bias, 0)
                nn.init.constant_(m.weight, 1)
            elif isinstance(m, nn.Linear):
                nn.init.constant_(m.bias, 0)

    def forward(self, x):
        features = self.features(x)
        # Global average pooling over the final 7x7 feature maps, then the linear classifier.
        out = F.avg_pool2d(features, 7, stride=1).view(features.size(0), -1)
        out = self.classifier(out)
        return out
```
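With the default arguments this class corresponds to DenseNet-121. A quick, illustrative way to check the construction is to count parameters and run a dummy forward pass; the total should come out at roughly 8 million parameters, the commonly quoted size of DenseNet-121:

```python
model = DenseNet()                      # defaults give the DenseNet-121 configuration
num_params = sum(p.numel() for p in model.parameters())
print("parameters: %.2fM" % (num_params / 1e6))   # roughly 8M for this configuration

x = torch.randn(1, 3, 224, 224)
print(model(x).shape)                   # torch.Size([1, 1000])
```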
Different network parameters give DenseNets of different depths. Here we build DenseNet-121, for which PyTorch provides pretrained weights (the `model_urls` dictionary below mirrors the checkpoint URL used by torchvision, which the original snippet relied on implicitly):

```python
# Pretrained checkpoint URL used by torchvision for DenseNet-121.
model_urls = {
    "densenet121": "https://download.pytorch.org/models/densenet121-a639ec97.pth",
}


def densenet121(pretrained=False, **kwargs):
    """DenseNet-121"""
    model = DenseNet(num_init_features=64, growth_rate=32, block_config=(6, 12, 24, 16),
                     **kwargs)

    if pretrained:
        # '.'s are no longer allowed in module names, but the previous _DenseLayer
        # has keys 'norm.1', 'relu.1', 'conv.1', 'norm.2', 'relu.2', 'conv.2'.
        # They are also in the checkpoints in model_urls. This pattern is used
        # to find such keys and rename them.
        pattern = re.compile(
            r'^(.*denselayer\d+\.(?:norm|relu|conv))\.((?:[12])\.(?:weight|bias|running_mean|running_var))$')
        state_dict = model_zoo.load_url(model_urls['densenet121'])
        for key in list(state_dict.keys()):
            res = pattern.match(key)
            if res:
                new_key = res.group(1) + res.group(2)
                state_dict[new_key] = state_dict[key]
                del state_dict[key]
        model.load_state_dict(state_dict)
    return model
```
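The other common variants differ only in `block_config` (and, for DenseNet-161, in the growth rate and initial width). As an illustration, and purely as my own addition rather than part of the original post, DenseNet-169 can be built with the same recipe; loading its pretrained weights would need the corresponding checkpoint URL, which is omitted here:

```python
def densenet169(**kwargs):
    """DenseNet-169: same recipe as densenet121, with deeper third and fourth blocks."""
    return DenseNet(num_init_features=64, growth_rate=32,
                    block_config=(6, 12, 32, 32), **kwargs)
```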
Now we run the pretrained network on an image and print the top-5 predictions:

```python
from PIL import Image
from torchvision import transforms

densenet = densenet121(pretrained=True)
densenet.eval()

img = Image.open("./images/cat.jpg")

# Standard ImageNet preprocessing: resize, center crop, tensor conversion, normalization.
trans_ops = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])

images = trans_ops(img).view(-1, 3, 224, 224)
outputs = densenet(images)

# Indices of the five highest-scoring classes.
_, predictions = outputs.topk(5, dim=1)

labels = list(map(lambda s: s.strip(), open("./data/imagenet/synset_words.txt").readlines()))
for idx in predictions.numpy()[0]:
    print("Predicted labels:", labels[idx])
```
The predictions are:

```text
Predicted labels: n tiger cat
Predicted labels: n tabby, tabby cat
Predicted labels: n lynx, catamount
Predicted labels: n Egyptian cat
Predicted labels: n kit fox, Vulpes macrotis
```
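The script above prints only the label strings. If you also want confidence scores, the logits can be turned into probabilities with a softmax; this is a small optional addition of mine, not part of the original code:

```python
probs = torch.softmax(outputs.detach(), dim=1)
top5_probs, top5_idx = probs.topk(5, dim=1)
for p, idx in zip(top5_probs[0].tolist(), top5_idx[0].tolist()):
    print("%.3f  %s" % (p, labels[idx]))
```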
Note: the complete code is available in xiaohu2015/DeepLearning_tutorials on GitHub.

Summary

This article described DenseNet's design philosophy and network architecture in detail and showed how to implement it with PyTorch. DenseNet takes a step forward from ResNet and holds real advantages over it, yet it never became as well known as ResNet (perhaps because of its GPU-memory appetite, or because the depth cannot be pushed as far). Here's hoping even better network models appear in the future.

References

1. DenseNet CVPR slides: http://www.cs.cornell.edu/~gaohuang/papers/DenseNet-CVPR-Slides.pdf
2. Huang et al., "Densely Connected Convolutional Networks", CVPR 2017.
重磅 | A survey of deep-learning-based object detection (Part 1)

(Also published on the author's WeChat public account, 机器学习算法全栈工程师 (Jeemy110); the text below is the beginning of the survey as reposted here.)

Overview

Image classification, detection, and segmentation are the three core tasks of computer vision. An image-classification model (see the review at medium.com/comet-app/review-of-deep-learning-algorithms-for-image-classification-5fdbca4a05e2) assigns a single category to an image, usually corresponding to its most prominent object. Real-world pictures, however, often contain more than one object, and a single label is then a crude and inaccurate description. For such cases we need object-detection models, which can recognize multiple objects in one image and localize each of them with a bounding box. Object detection is useful in many settings, such as autonomous driving and security systems.

[Figure: image classification vs. object detection vs. instance segmentation]

Mainstream detectors today are deep-learning models and fall into two groups. Two-stage detectors split the problem into two steps: first generate region proposals, then classify them (usually with an additional location refinement); typical representatives are the region-proposal-based R-CNN family, such as R-CNN, Fast R-CNN, and Faster R-CNN. One-stage detectors skip the proposal stage and directly predict class probabilities and box coordinates; typical examples are YOLO and SSD. The main performance criteria are detection accuracy and speed, where accuracy must account for localization as well as classification. In general, two-stage methods have the edge on accuracy and one-stage methods on speed, although research keeps improving both families on both fronts. In 2017 Google open-sourced the TensorFlow Object Detection API (github.com/tensorflow/models/tree/master/research/object_detection) and published a careful comparison of Faster R-CNN, R-FCN, and SSD on MS COCO (Huang et al., 2017), shown below. More recently, Facebook's FAIR open-sourced Detectron (github.com/facebookresearch/Detectron), a Caffe2-based detection platform that implements recent detectors such as Mask R-CNN and RetinaNet and publishes baseline results for them. Accuracy and speed remain a pair of conflicting goals, and balancing them well has long been a central direction of detection research.

[Figure: performance comparison of Faster R-CNN, R-FCN, and SSD on MS COCO]

This long article surveys recent object-detection algorithms. Before introducing them, we briefly review the datasets and performance metrics commonly used in the field.

Datasets and performance metrics

Commonly used datasets include PASCAL VOC, ImageNet, and MS COCO; they serve both as benchmarks for researchers and as competition data. The performance metrics must reflect the accuracy of the predicted locations as well as of the predicted classes; the most common ones are described below.

Datasets

PASCAL VOC (The PASCAL Visual Object Classification challenge) is a well-known dataset for detection, classification, and segmentation; eight challenges were held from 2005 to 2012. PASCAL VOC contains roughly 10,000 images with bounding boxes for training and validation, but only 20 object categories, so it is treated as a baseline benchmark for detection.

ImageNet released an object-detection dataset with bounding boxes in 2013. Its training set contains about 500,000 images covering 200 object classes. Because the dataset is so large and training is so expensive, it is rarely used, and the large number of classes also makes detection considerably harder. A comparison between the 2014 ImageNet dataset and the 2012 PASCAL VOC dataset is available on the ILSVRC 2014 site.

Another famous dataset is MS COCO (Common Objects in COntext), built by Microsoft (see T.-Y. Lin et al., 2015). It is used for several competitions: image captioning, object detection, keypoint detection, and segmentation. For detection, COCO has 80 categories; each year's training and validation sets contain more than 120,000 images, plus over 40,000 test images. The test set was recently split into test-dev for researchers and test-challenge for competitors, and its labels are withheld to avoid overfitting on the test set. In the COCO 2017 Detection Challenge, the Megvii team took first place (AP 0.526) with its Light-Head R-CNN, which suggests that two-stage detectors still lead on accuracy.

[Figure: segmentation examples from the 2015 COCO dataset. Source: T.-Y. Lin et al. (2015)]
[Figure: mainstream object-detection datasets. Source: https://tryolabs.com/blog/]

Performance metrics

Object detection is simultaneously a regression and a classification problem. To evaluate localization accuracy one computes the IoU (Intersection over Union, a value between 0 and 1), which measures how much the predicted box overlaps the ground-truth box; the higher the IoU, the more accurate the predicted location. When evaluating predicted boxes, an IoU threshold (for example 0.5) is therefore set: a prediction counts as a true positive (TP) only if its IoU with a ground-truth box exceeds the threshold, and otherwise it is a false positive (FP).

For binary classification, AP (Average Precision) is an important metric; it is a concept from information retrieval.
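Since IoU underlies every detection metric, here is a small, self-contained reference implementation for axis-aligned boxes. The code and the box format `[x1, y1, x2, y2]` are my own illustration, not taken from the original article:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as [x1, y1, x2, y2]."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

pred, gt = [50, 50, 150, 150], [60, 60, 170, 160]
print(iou(pred, gt))   # about 0.63, so a true positive at the usual 0.5 threshold
```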
