Test Datasets and Source Code Sites from Computer Vision: Algorithms and Applications

2025-11-08

Below are the computer vision test datasets and source code sites listed in the appendix of the book Computer Vision: Algorithms and Applications. I have tidied them up and added a few notes.

Computer Vision: Algorithms and Applications

Richard Szeliski

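In the final appendix of this book, I summarize some additional material that is useful for students, professors, and researchers. The book's Web site, http://szeliski.org/Book, contains updated datasets and software, so please check there as well.

C.1 Datasets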

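One key to developing reliable algorithms is to test them on challenging and representative datasets. When ground truth or other people's results are available, such tests can be even more informative (and quantitative). Over the years, a large number of datasets have been developed for testing and evaluating computer vision algorithms. Many of these datasets and software packages are indexed on the Computer Vision Homepage. Some newer Web sites, such as CVonline (http://homepages.inf.ed.ac.uk/rbf/CVonline), VisionBib.Com (http://datasets.visionbib.com/), and Computer Vision online (http://computervisiononline.com/), have more recent pointers to datasets and software. Below, I list some of the most widely used datasets, organized by the book chapters to which they most closely relate.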

Chapter 2: Image formation

CUReT: Columbia-Utrecht Reflectance and Texture Database, http://www1.cs.columbia.edu/CAVE/software/curet/ (Dana, van Ginneken, Nayar et al. 1999).

Middlebury Color Datasets: registered color images taken by different cameras to study how they transform gamuts and colors, http://vision.middlebury.edu/color/data/ (Chakrabarti, Scharstein, and Zickler 2009).

Chapter 3: Image processing

Middlebury test datasets for evaluating MRF (Markov random field) minimization/inference algorithms, http://vision.middlebury.edu/MRF/results/ (Szeliski, Zabih, Scharstein et al. 2008).

Chapter 4: Feature detection and matching

Affine Covariant Features database for evaluating feature detector and descriptor matching quality and repeatability, http://www.robots.ox.ac.uk/~vgg/research/affine/ (Mikolajczyk and Schmid 2005; Mikolajczyk, Tuytelaars, Schmid et al. 2005).

Database of matched image patches for learning and feature descriptor evaluation, http://cvlab.epfl.ch/~brown/patchdata/patchdata.html (Winder and Brown 2007; Hua, Brown, and Winder 2007).

Chapter 5: Segmentation

Berkeley Segmentation Dataset and Benchmark of 1000 images labeled by 30 humans, along with an evaluation, http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/segbench/ (Martin, Fowlkes, Tal et al. 2001).

Weizmann segmentation evaluation database of 100 grayscale images with ground truth segmentations, http://www.wisdom.weizmann.ac.il/~vision/Seg_Evaluation_DB/index.html (Alpert, Galun, Basri et al. 2007).

Chapter 8: Dense motion estimation

The Middlebury optic flow evaluation Web site, http://vision.middlebury.edu/flow/data/ (Baker, Scharstein, Lewis et al. 2009).

The Human-Assisted Motion Annotation database, http://people.csail.mit.edu/celiu/motionAnnotation/ (Liu, Freeman, Adelson et al. 2008).
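Flow benchmarks such as the Middlebury evaluation above compare an estimated flow field with ground truth, and one commonly reported measure is the average endpoint error. The following is only a minimal sketch of that measure, not the benchmark's own code. The function name, the list-of-tuples layout, and the use of None for pixels without ground truth are assumptions made for illustration.

```python
import math


def average_endpoint_error(flow_est, flow_gt):
    """Mean Euclidean distance between estimated and ground-truth flow vectors.

    Each flow field is a list of rows, and each row is a list of (u, v) tuples.
    A ground-truth entry of None marks a pixel with unknown flow and is skipped
    (this masking convention is an assumption of the sketch).
    """
    total = 0.0
    count = 0
    for row_est, row_gt in zip(flow_est, flow_gt):
        for est, gt in zip(row_est, row_gt):
            if gt is None:
                continue
            # Length of the difference vector (u_est - u_gt, v_est - v_gt).
            total += math.hypot(est[0] - gt[0], est[1] - gt[1])
            count += 1
    if count == 0:
        raise ValueError("no pixels with ground truth")
    return total / count


if __name__ == "__main__":
    estimated = [[(1.0, 0.0), (0.0, 0.0)], [(3.0, 4.0), (1.0, 1.0)]]
    truth = [[(1.0, 0.0), (3.0, 4.0)], [(0.0, 0.0), None]]
    print(average_endpoint_error(estimated, truth))
```

A real evaluation would first load the flow fields from the benchmark's files. That step is omitted here because the file format is not described in this post.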

Chapter 10: Computational photography

High Dynamic Range radiance maps, http://www.debevec.org/Research/HDR/ (Debevec and Malik 1997).

Alpha matting evaluation Web site, http://alphamatting.com/ (Rhemann, Rother, Wang et al. 2009).

Chapter 11: Stereo correspondence

Middlebury Stereo Datasets and Evaluation, http://vision.middlebury.edu/stereo/ (Scharstein and Szeliski 2002).

Stereo Classification and Performance Evaluation of different aggregation costs for stereo matching, http://www.vision.deis.unibo.it/spe/SPEHome.aspx (Tombari, Mattoccia, Di Stefano et al. 2008).

Middlebury Multi-View Stereo Datasets, http://vision.middlebury.edu/mview/data/ (Seitz, Curless, Diebel et al. 2006).

Multi-view and Oxford Colleges building reconstructions, http://www.robots.ox.ac.uk/~vgg/data/data-mview.html.

Multi-View Stereo Datasets, http://cvlab.epfl.ch/data/strechamvs/ (Strecha, Fransens, and Van Gool 2006).

Multi-View Evaluation, http://cvlab.epfl.ch/~strecha/multiview/ (Strecha, von Hansen, Van Gool et al. 2008).
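Stereo datasets with ground-truth disparities, such as the Middlebury ones above, allow a quantitative score. One common score is the fraction of "bad" pixels, those whose disparity error exceeds a threshold. The sketch below illustrates that idea only and is not the evaluation code of any of the sites listed. The function name, the default threshold of 1.0, and the None marker for pixels without ground truth are assumptions.

```python
def bad_pixel_rate(disp_est, disp_gt, threshold=1.0):
    """Fraction of valid pixels whose absolute disparity error exceeds threshold.

    Both maps are lists of rows of numbers. A ground-truth value of None marks
    a pixel without known disparity and is ignored (an assumed convention).
    """
    valid = 0
    bad = 0
    for row_est, row_gt in zip(disp_est, disp_gt):
        for d_est, d_gt in zip(row_est, row_gt):
            if d_gt is None:
                continue
            valid += 1
            if abs(d_est - d_gt) > threshold:
                bad += 1
    if valid == 0:
        raise ValueError("no pixels with ground truth")
    return bad / valid


if __name__ == "__main__":
    estimated = [[10.0, 12.5], [7.0, 3.0]]
    truth = [[10.5, 10.0], [None, 3.0]]
    print(bad_pixel_rate(estimated, truth))
```

Lowering the threshold makes the score stricter, so results are only comparable when the same threshold is used.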

Chapter 12: 3D reconstruction

HumanEva: synchronized video and motion capture dataset for evaluation of articulated human motion, http://vision.cs.brown.edu/humaneva/ (Sigal, Balan, and Black 2010).

Chapter 13: Image-based rendering

The (New) Stanford Light Field Archive, http://lightfield.stanford.edu/ (Wilburn, Joshi, Vaish et al. 2005).

Virtual Viewpoint Video: multi-viewpoint video with per-frame depth maps, http://research.microsoft.com/en-us/um/redmond/groups/ivm/vvv/ (Zitnick, Kang, Uyttendaele et al. 2004).

Chapter 14: Recognition

For a list of visual recognition datasets, see Tables 14.1–14.2. In addition to those, there are also:

Buffy pose classes, http://www.robots.ox.ac.uk/~vgg/data/buffy_pose_classes/ and Buffy stickmen V2.1, http://www.robots.ox.ac.uk/~vgg/data/stickmen/index.html (Ferrari, Marin-Jimenez, and Zisserman 2009; Eichner and Ferrari 2009).

H3D database of pose/joint annotated photographs of humans, http://www.eecs.berkeley.edu/~lbourdev/h3d/ (Bourdev and Malik 2009).

Action Recognition Datasets, http://www.cs.berkeley.edu/projects/vision/action, has pointers to several datasets for action and activity recognition, as well as some papers. The human action database at http://www.nada.kth.se/cvap/actions/ contains more action sequences.

C.2 Software

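One of the best sources for computer vision algorithms is the Open Source Computer Vision library (OpenCV) (http://opencv.willowgarage.com/wiki/), which was developed by Gary Bradski and his colleagues at Intel and is now maintained and extended at Willow Garage (Bradski and Kaehler 2008). A partial list of the available functions, taken from http://opencv.willowgarage.com/documentation/cpp/, includes: image processing and transforms (filtering, morphology, pyramids); geometric image transformations (rotations, resizing); miscellaneous image transformations (Fourier transforms, distance transforms); histograms; segmentation (watershed, mean shift); feature detection (Canny, Harris, Hough, MSER, SURF); motion analysis and object tracking (Lucas–Kanade, mean shift); camera calibration and 3D reconstruction; and machine learning (k nearest neighbors, support vector machines, decision trees, boosting, random trees, expectation-maximization, and neural networks).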

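Intel's Performance Primitives (IPP) library, http://software.intel.com/en-us/intel-ipp/, contains highly optimized code for a variety of image processing tasks. Many of the routines in OpenCV take advantage of this library and run faster if it is installed. In terms of functionality, it has many of the same operators as OpenCV, plus additional libraries for image and video compression, signal and speech processing, and matrix algebra.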

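The MATLAB Image Processing Toolbox, http://www.mathworks.com/products/image/, contains routines for spatial transformations (rotations, resizing), normalized cross-correlation, image analysis and statistics (edges, Hough transform), image enhancement (adaptive histogram equalization, median filtering), image restoration (deblurring), linear filtering (convolution), image transforms (Fourier, discrete cosine transform), and morphological operations (connected components and distance transforms).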

Two older libraries, which are no longer under active development but contain many useful routines, are VXL (C++ Libraries for Computer Vision Research and Implementation, http://vxl.sourceforge.net/) and LTI-Lib 2 (http://www.ie.itcr.ac.cr/palvarado/ltilib-2/homepage/).

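Photo editing and viewing packages, such as Windows Live Photo Gallery, iPhoto, Picasa, GIMP, and IrfanView, are useful for performing common processing tasks, converting formats, and viewing your results. They can also serve as interesting reference implementations of image processing algorithms, such as tone adjustment and denoising.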

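There are also some software packages and infrastructure that are helpful for building real-time video processing demos. Vision on Tap (http://www.visionontap.com/) provides a Web service that processes your webcam video in real time (Chiu and Raskar 2009). VideoMan (VideoManager, http://videomanlib.sourceforge.net/) is very useful for getting real-time video-based demos and applications running. You can also use imread in MATLAB to read images directly from any URL, such as a webcam.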

Below, I list some additional software that can be found on the Web, organized by the book chapters to which it most closely relates:

Chapter 3: Image processing

matlabPyrTools: MATLAB source code for Laplacian pyramids, QMF/wavelets, and steerable pyramids, http://www.cns.nyu.edu/~lcv/software.php (Simoncelli and Adelson 1990a; Simoncelli, Freeman, Adelson et al. 1992).

BLS-GSM image denoising, http://decsai.ugr.es/~javier/denoise/ (Portilla, Strela, Wainwright et al. 2003).

Fast bilateral filtering code, http://people.csail.mit.edu/jiawen/#code (Chen, Paris, and Durand 2007).

C++ implementation of the fast distance transform algorithm, http://people.cs.uchicago.edu/~pff/dt/ (Felzenszwalb and Huttenlocher 2004a).

GREYC’s Magic Image Converter, including image restoration software using regularization and anisotropic diffusion, http://gmic.sourceforge.net/gimp.shtml (Tschumperlé and Deriche 2005).
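To give a feel for the fast distance transform listed above (Felzenszwalb and Huttenlocher 2004a), here is a small Python sketch of its one-dimensional core. It computes the lower envelope of the parabolas rooted at each sample in linear time. This is an independent illustration, not the authors' C++ code. The function name and the large finite constant INF standing in for "no feature here" are assumptions.

```python
import math

INF = 1e20  # large finite cost for "no feature here"; true infinities would produce NaNs


def distance_transform_1d(f):
    """Return d with d[p] = min over q of (p - q)**2 + f[q], for finite f."""
    n = len(f)
    d = [0.0] * n
    v = [0] * n          # sample positions of the parabolas in the lower envelope
    z = [0.0] * (n + 1)  # z[k]..z[k+1] is the range where parabola v[k] is lowest
    k = 0
    z[0], z[1] = -math.inf, math.inf
    for q in range(1, n):
        while True:
            p = v[k]
            # Horizontal position where the parabolas rooted at p and q intersect.
            s = ((f[q] + q * q) - (f[p] + p * p)) / (2 * q - 2 * p)
            if s <= z[k]:
                k -= 1   # parabola p is hidden by q; drop it and retry
            else:
                break
        k += 1
        v[k] = q
        z[k] = s
        z[k + 1] = math.inf
    k = 0
    for q in range(n):
        while z[k + 1] < q:
            k += 1
        d[q] = (q - v[k]) ** 2 + f[v[k]]
    return d


if __name__ == "__main__":
    # One feature at index 1; every other sample starts at the large cost.
    print(distance_transform_1d([INF, 0.0, INF, INF, INF]))
```

For a binary image, one would set f to 0 at feature pixels and INF elsewhere, run this pass along every row, then run it again along every column of the result to obtain squared Euclidean distances.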

Chapter 4: Feature detection and matching

VLFeat, an open and portable library of computer vision algorithms, http://vlfeat.org/ (Vedaldi and Fulkerson 2008).

SiftGPU: A GPU Implementation of Scale Invariant Feature Transform (SIFT), http://www.cs.unc.edu/~ccwu/siftgpu/ (Wu 2010).

SURF: Speeded Up Robust Features, http://www.vision.ee.ethz.ch/~surf/ (Bay, Tuytelaars, and Van Gool 2006).

FAST corner detection, http://mi.eng.cam.ac.uk/~er258/work/fast.html (Rosten and Drummond 2005, 2006).

Linux binaries for affine region detectors and descriptors, as well as MATLAB files to compute repeatability and matching scores, http://www.robots.ox.ac.uk/~vgg/research/affine/

Kanade–Lucas–Tomasi feature trackers: KLT, http://www.ces.clemson.edu/~stb/klt/ (Shi and Tomasi 1994); GPU-KLT, http://cs.unc.edu/~cmzach/opensource.html (Zach, Gallup, and Frahm 2008); Lucas–Kanade 20 Years On, http://www.ri.cmu.edu/projects/project_515.html (Baker and Matthews 2004).

Chapter 5: Segmentation

Efficient graph-based image segmentation, http://people.cs.uchicago.edu/~pff/segment (Felzenszwalb and Huttenlocher 2004b).

EDISON, edge detection and image segmentation, http://coewww.rutgers.edu/riul/research/code/EDISON/ (Meer and Georgescu 2001; Comaniciu and Meer 2002).

Normalized cuts segmentation including intervening contours, http://www.cis.upenn.edu/~jshi/software/ (Shi and Malik 2000; Malik, Belongie, Leung et al. 2001).

Segmentation by weighted aggregation (SWA), http://www.cs.weizmann.ac.il/~vision/SWA (Alpert, Galun, Basri et al. 2007).

Chapter 6: Feature-based alignment

Non-iterative PnP algorithm, http://cvlab.epfl.ch/software/EPnP (Moreno-Noguer, Lepetit, and Fua 2007).
