VCC

Visual Computing Research Center
Shenzhen University, China

Abstract: We present the UrbanBIS benchmark for large-scale 3D urban understanding, supporting practical urban-level semantic and building-level instance segmentation. UrbanBIS comprises six real urban scenes, with 2.5 billion points, covering a vast area of 10.78 square kilometers and 3,370 buildings, captured by 113,346 views of aerial photogrammetry. Particularly, UrbanBIS provides not only semantic-level annotations on a rich set of urban objects, including buildings, vehicles, vegetation, roads, and bridges, but also instance-level annotations on the buildings. Further, UrbanBIS is the first 3D dataset that introduces fine-grained building sub-categories, considering a wide variety of shapes for different building types. Besides, we propose B-Seg, a building instance segmentation method to establish UrbanBIS. B-Seg adopts an end-to-end framework with a simple yet effective strategy for handling large-scale point clouds. Compared with mainstream methods, B-Seg achieves better accuracy with faster inference speed on UrbanBIS. In addition to the carefully-annotated point clouds, UrbanBIS provides high-resolution aerial-acquisition photos and high-quality large-scale 3D reconstruction models, which shall facilitate a wide range of studies such as multi-view stereo, urban LOD generation, aerial path planning, autonomous navigation, road network extraction, and so on, thus serving as an important platform for many intelligent city applications.

Download: arXiv, paper, benchmark dataset, B-Seg dataset, code, supp

Declaration: UrbanBIS is publicly accessible for non-commercial uses only. Permission is granted to use the data only if you agree:
- The dataset is provided "AS IS". Despite our best efforts to assure accuracy, we disclaim all liability for any mistakes or omissions;
- All works that utilize this dataset including any partial use must cite our paper provided below;
- You refrain from disseminating this dataset or any altered variations;
- You are not permitted to utilize this dataset or any derivative work for any commercial endeavors;
- We reserve all rights that are not explicitly granted to you.

Privacy Concerns: We place great emphasis on ensuring the privacy and confidentiality of all data involved. Our practices align with the highest standards set by relevant laws and regulations. We have implemented robust measures to mitigate privacy concerns effectively. In the rare instance that you identify any privacy issues pertaining to your information within our dataset, please reach out to us promptly. We assure you that we will immediately remove the affected data upon receiving your request, prioritizing your privacy and confidentiality.

Data Format Description For UrbanBIS Dataset

Each line of the txt file represents one point and contains the following information:
- X Y Z R G B Semantic_label Instance_label Fine-grained_building_category;
- Semantic_label = {'Terrain': 0, 'Vegetation': 1, 'Water': 2, 'Bridge': 3, 'Vehicle': 4, 'Boat': 5, 'Building': 6};
- Fine-grained_building_category = {'Commercial': 0, 'Residential': 1, 'Office': 2, 'Cultural': 3, 'Transportation': 4, 'Municipal': 5, 'Temporary': 6, 'Unclassified': 7}.
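
A minimal sketch of reading this format in Python, assuming the nine fields are whitespace-separated and appear in the order listed above. The names `Point` and `parse_point_line` and the sample line are illustrative and not part of the dataset tooling. How non-building points encode the instance and fine-grained fields is not stated here, so the sketch simply passes those values through as integers.

```python
from typing import NamedTuple

# Label maps copied from the format description, inverted to id -> name.
SEMANTIC_LABELS = {0: "Terrain", 1: "Vegetation", 2: "Water", 3: "Bridge",
                   4: "Vehicle", 5: "Boat", 6: "Building"}
BUILDING_CATEGORIES = {0: "Commercial", 1: "Residential", 2: "Office",
                       3: "Cultural", 4: "Transportation", 5: "Municipal",
                       6: "Temporary", 7: "Unclassified"}


class Point(NamedTuple):
    x: float
    y: float
    z: float
    r: int
    g: int
    b: int
    semantic: int
    instance: int
    category: int


def parse_point_line(line: str) -> Point:
    """Parse one 'X Y Z R G B Semantic Instance Category' line."""
    fields = line.split()  # assumption: whitespace-separated values
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    x, y, z = (float(v) for v in fields[:3])
    # int(float(v)) tolerates labels written as e.g. "6.0" (an assumption).
    r, g, b, semantic, instance, category = (int(float(v)) for v in fields[3:])
    return Point(x, y, z, r, g, b, semantic, instance, category)


# Made-up sample line, for illustration only.
sample = "12.5 -3.0 41.2 128 130 125 6 17 1"
point = parse_point_line(sample)
print(SEMANTIC_LABELS[point.semantic], point.instance,
      BUILDING_CATEGORIES[point.category])
```

For full scenes with hundreds of millions of points, a line-by-line parser like this is only suitable for inspection; a chunked or array-based reader would be the more practical choice.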

BENCHMARK

Category	Qingdao	Wuhu	Longhua	Yuehai	Lihu	Yingrenshi
Labeled Point Cloud	26.5 GB	27.8 GB	29.1 GB	17.5 GB	11.5 GB	0.92 GB
Total points	594.06 M	625.08 M	653.90 M	393.37 M	255.12 M	22.22 M
Building	269.59 M	285.28 M	256.39 M	117.98 M	65.18 M	14.97 M
Ground	114.22 M	133.32 M	158.62 M	69.60 M	80.54 M	4.39 M
Water	11.46 M	20.95 M	0.26 M	3.86 M	2.46 M	0
Boat	4.20 M	409	852	0	2,490	0
Vegetation	179.50 M	175.69 M	225.50 M	197.83 M	104.09 M	1.66 M
Vehicle	15.05 M	8.24 M	11.35 M	1.16 M	2.08 M	0.85 M
Bridge	37,074	1.61 M	1.77 M	2.93 M	0.78 M	0.35 M

Point counts are given in millions (M) unless written out in full.
Per-scene previews: Scene Point Cloud, Semantic Segmentation, Fine-grained Building Category, Building Instance Segmentation.
Also listed per scene: Images, Textured Meshes.

B-SEG

overview

results (↑: higher is better; ↓: lower is better; T(s): time in seconds)

Method	Qingdao				Wuhu				Longhua				Campus
	↑AP	↑AP50	↑AP25	↓T(s)	↑AP	↑AP50	↑AP25	↓T(s)	↑AP	↑AP50	↑AP25	↓T(s)	↑AP	↑AP50	↑AP25	↓T(s)
PointGroup [1]	0.364	0.512	0.578	9.80	0.502	0.662	0.748	5.90	0.318	0.443	0.556	5.73	0.117	0.235	0.455	3.65
HAIS [2]	0.320	0.465	0.506	7.11	0.383	0.616	0.711	3.62	0.159	0.249	0.350	3.17	0.002	0.012	0.146	3.26
SoftGroup [3]	0.383	0.446	0.487	6.55	0.536	0.649	0.721	3.61	0.151	0.199	0.300	3.06	0.253	0.364	0.439	2.16
DyCo3D [4]	0.285	0.376	0.498	5.20	0.470	0.620	0.732	3.04	0.020	0.045	0.196	1.77	0.029	0.063	0.180	1.67
DKNet [5]	0.383	0.434	0.474	2.15	0.474	0.575	0.650	1.20	0.077	0.154	0.253	1.78	0.044	0.109	0.251	0.88
B-Seg (Ours)	0.453	0.550	0.672	1.19	0.549	0.674	0.767	0.99	0.402	0.513	0.618	1.16	0.261	0.386	0.535	0.74
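
This page does not define AP, AP50, and AP25. Assuming they follow the common convention in 3D instance segmentation benchmarks, a predicted instance counts as a match when its point-wise IoU with a ground-truth instance reaches 0.5 (AP50) or 0.25 (AP25), and AP averages over a range of IoU thresholds. The sketch below illustrates only that matching criterion. The function name and the toy point indices are illustrative; this is not the authors' evaluation code, and it omits confidence ranking and the precision-recall integration.

```python
from typing import Set


def instance_iou(pred_points: Set[int], gt_points: Set[int]) -> float:
    """Intersection over union of two instances given as sets of point indices."""
    union = len(pred_points | gt_points)
    if union == 0:
        return 0.0
    return len(pred_points & gt_points) / union


# Toy example: the prediction covers points 0..5, the ground truth points 3..9.
pred = set(range(0, 6))
gt = set(range(3, 10))

iou = instance_iou(pred, gt)   # 3 shared points out of 10 in the union
matches_ap25 = iou >= 0.25     # passes the looser threshold
matches_ap50 = iou >= 0.5      # fails the stricter threshold
print(f"IoU={iou:.2f} AP25 match={matches_ap25} AP50 match={matches_ap50}")
```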

dataset

Training set: 5.92 GB
Test set: 2.49 GB

APPLICATION

3D RECONSTRUCTION

Figure panels (two examples): POINT CLOUD, RECONSTRUCTED MESH

LOD RECONSTRUCTION

Kinetic Shape Reconstruction

ORIGINAL MESH

KSR RESULT [6]

VIRTUAL SCENE DESIGN

DATASET PREVIEW