Glasses-free 3D player based on Unity 3D

From CGTWiki

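Main features

Supports glasses-free 3D playback of side-by-side (left-right), top-bottom, nine-grid (3×3 tiled), and 2DZ (color plus depth) video.

The 2DZ rendering algorithm does not yet match the display quality of products on the market.

A companion Unity 3D tool for testing grating parameters is also provided. It can test the parameters of a single screen or of multiple screens.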

Test videos

Side-by-side video

Nine-grid video

Landscape video

Portrait video

Player with portrait pixel arrangement; key shader code for use with AVPro

2DZ video


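Standard version

Standard stereoscopic player

test.xml file: Testxml.zip (stores the display parameters); place it in the root of drive D:

Place videos in the c:\video\ directory.

The file name selects the format: side-by-side videos must contain lrv in the name, nine-grid videos must contain 9grid, and color-plus-depth videos must contain 2dz.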
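
The naming rule above can be sketched as a small lookup. This is only an illustration: the enum, the function name `detectFormat`, the case-insensitive match, and the order in which the tags are checked are all assumptions, not the player's actual code.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Layouts named in the text above (hypothetical enum).
enum class VideoFormat { SideBySide, NineGrid, ColorPlusDepth, Unknown };

// Hypothetical helper: lower-cases the file name, then looks for the
// tags "lrv", "9grid" and "2dz". Case-insensitivity is an assumption.
inline VideoFormat detectFormat(std::string name) {
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    if (name.find("lrv") != std::string::npos) return VideoFormat::SideBySide;
    if (name.find("9grid") != std::string::npos) return VideoFormat::NineGrid;
    if (name.find("2dz") != std::string::npos) return VideoFormat::ColorPlusDepth;
    return VideoFormat::Unknown;  // no tag found
}
```

With this sketch, a file such as `demo_lrv.mp4` would be treated as side-by-side, and a name without any tag falls through to `Unknown`.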

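Eye-tracking version

Eye-tracking stereoscopic player

  • Place videos in d:\video

Download the OpenFace dataset and place it in D:\models

Link: https://pan.baidu.com/s/1WmIy_P8xB0lmGhbCAb7WwQ
Extraction code: 1mwt
目录结构.png (directory structure)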

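Key bindings:

F7   switches between the player scene and the grating parameter test scene

Player scene:

S key: show/hide the on-screen text
Arrow keys: adjust the viewpoint offset
F1-F6 keys: adjust the offset values

Grating calibration scene:
M key: show the current parameters
Up, Down, F1, F2 keys: adjust the slant angle
Left, Right, F3, F4 keys: adjust the line count
F6 key: switch the subpixel order (RGB, BGR)
V, B keys: adjust the viewpoint offset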

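UDP central-control version

UDP central-control stereoscopic player

Control codes

Commands are sent over UDP.
The receiver automatically detects the local IP address and port.
The sender sends control commands to port 8888 to control the player.
Send 0E: play video
Send 14: pause video
Send 10: next video
Send 0C: previous video
Send 0A: volume up
Send 0D: volume down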
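
As a rough illustration, the control codes listed above can be decoded with a single switch. The sketch assumes each command arrives as one byte and that all six codes are hexadecimal (the page does not say so explicitly); `Command` and `decodeCommand` are hypothetical names.

```cpp
#include <cstdint>

// Commands from the control-code list above (hypothetical enum).
enum class Command { Play, Pause, Next, Previous, VolumeUp, VolumeDown, Unknown };

// Hypothetical decoder: assumes one command byte, read as hexadecimal.
inline Command decodeCommand(std::uint8_t code) {
    switch (code) {
        case 0x0E: return Command::Play;
        case 0x14: return Command::Pause;
        case 0x10: return Command::Next;
        case 0x0C: return Command::Previous;
        case 0x0A: return Command::VolumeUp;
        case 0x0D: return Command::VolumeDown;
        default:   return Command::Unknown;  // ignore anything else
    }
}
```

A receiver loop would read a datagram from port 8888 and pass its first byte to `decodeCommand`; the socket handling is omitted here because it is platform specific.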

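Usage and playback logic

Usage
File layout and naming rules
File layout:
The folder containing the .exe file looks like this:
Exe所在文件夹目录.png (folder containing the .exe)
Inside the video folder, create two subfolders named video and logo.
The video subfolder holds all videos to be played.
The logo subfolder holds the standby video.
Video文件夹目录.png (contents of the video folder)
The device's test.xml file goes in the folder containing the .exe.
Naming rules:
Standby video names start with 0_
Other video names should not start with 0
Video下命名规则.png (naming rules under video)
These rules ensure that the standby video loops every time playback starts.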
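
The naming rules above amount to a prefix check. The following is a minimal sketch under that reading; `VideoRole` and `classifyByName` are hypothetical names, and flagging a non-standby name that still begins with 0 as `Ambiguous` is an illustrative choice, not documented player behavior.

```cpp
#include <string>

// Roles implied by the naming rules above (hypothetical enum).
enum class VideoRole { Standby, Main, Ambiguous };

// Hypothetical classifier: "0_" prefix marks a standby video; any other
// name starting with '0' breaks the "avoid a leading 0" rule.
inline VideoRole classifyByName(const std::string& fileName) {
    if (fileName.rfind("0_", 0) == 0) return VideoRole::Standby;  // starts with "0_"
    if (!fileName.empty() && fileName[0] == '0') return VideoRole::Ambiguous;
    return VideoRole::Main;
}
```

Running such a check over the video folder before deployment would catch main videos that were accidentally named with a leading 0.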

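Procedure
Double-click 3DPlayer.exe to start playing 3D video.
On first use, hold the ALT key after double-clicking 3DPlayer.exe to open the resolution settings window. Once the resolution is set, later launches reuse it and no further change is needed.
Send central-control commands over UDP.
The receiver automatically detects the local IP address and port; make sure the firewall on the receiver is turned off.
The sender sends control commands to port 8888 on the receiver.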



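Playback logic
When playback starts, the logo video loops until a next or previous command is received;
while main videos are playing, each time a main video finishes the logo video loops until a next or previous command is received;
during playback, UDP commands can play or stop the video and raise or lower the volume.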
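
The playback logic above behaves like a two-state machine. The sketch below is one possible reading: all names are hypothetical, and the wrap-around at the ends of the playlist and the choice of the first video after the logo loop are assumptions that the page does not specify.

```cpp
// Sketch of the playback logic; all names here are hypothetical.
enum class Mode { LogoLoop, MainVideo };
enum class Event { Next, Previous, VideoFinished };

struct PlayerState {
    Mode mode = Mode::LogoLoop;  // playback starts in the logo loop
    int index = -1;              // last selected main video; -1 = none yet
};

// Returns the state after one event. videoCount is the number of main videos.
inline PlayerState step(PlayerState s, Event e, int videoCount) {
    if (videoCount <= 0) return s;  // nothing to play: stay in the logo loop
    switch (e) {
        case Event::Next:
            s.index = (s.index + 1) % videoCount;  // assumed wrap-around
            s.mode = Mode::MainVideo;
            break;
        case Event::Previous:
            s.index = (s.index <= 0) ? videoCount - 1 : s.index - 1;  // assumed wrap-around
            s.mode = Mode::MainVideo;
            break;
        case Event::VideoFinished:
            s.mode = Mode::LogoLoop;  // back to the standby loop
            break;
    }
    return s;
}
```

Play, stop, and volume commands do not change the mode, so they are left out of the state machine.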

Third-party decoder library

To be able to play all videos, install LavFilter.

DIBR algorithm

Guo Nan's algorithm (郭南算法)

// Uses atomicMax and coalesced memory access for fast order computation
#include <cuda.h>
#include "cuda_runtime.h"
#include <cuda_runtime_api.h>
#include "device_functions.h"
#include <helper_cuda.h>
#include <helper_math.h>
#include <helper_functions.h>
#include "device_launch_parameters.h"
#include <cuda_d3d10_interop.h>

#include <iostream>
#include <time.h>

typedef unsigned char uchar;
typedef unsigned int uint;
#define CONST_VIEW 1
#define KERNEL_FREQUENCY 1202500 // GPU clock frequency
#define blockSize_x 64
#define blockSize_y 4

#define COLOR_DIFF_UINT(x, i, j) (__usad4(x[i], x[j]))

using namespace std;

__constant__ float d_range_table[256];//ip filter
__constant__ float cGaussian[64];//bilateral filter

float upsample_x = 2;
float upsample_y = 1;

int downsample_x = 2;
int downsample_y = 4;

float multiplex, multiple_x;
float multipley, multiple_y;

int frame_size_in_3c;
int frame_size_in_4c;
int frame_size_in_drgb;
int frame_size_in_1c;

int imgw, imgh, img_w, img_h, out_w, out_h;
int framesize_out_3c;
int framesize_out_4c;

int BLOCK_NUM_init;

dim3 block_init;
dim3 grid_init;
dim3 block_multiview;
dim3 grid_multiview;
dim3 block_adapt;
dim3 grid_adapt;
dim3 block_filter;
dim3 grid_filter;

dim3 block_mix;
dim3 grid_mix;

dim3 block_test;
dim3 grid_test;

size_t pitch;        // display     

// src
unsigned int *dImgL_src = NULL;
unsigned int *dImgR_src = NULL;
unsigned int *dImgL_turn = NULL;

unsigned int *dImgL_down = NULL;
unsigned int *dImgR_down = NULL;

uchar *dImgD_label = NULL;

// dst
unsigned int *dImgL = NULL; // contain depth information and changing size
int *dImgD = NULL; 
int *dImgD2 = NULL; 
uchar* d_Depth_in = NULL;//depth image 
uchar* d_Depth_view0 = NULL;
unsigned char *d_depth_scale = NULL;
unsigned char *d_depth_scale2 = NULL;

unsigned int *dImage = NULL;   //original image
unsigned int *dView = NULL;   //temp array for iterations

unsigned char *d_frame_3c = NULL;
unsigned char *d_frame_4c = NULL;
unsigned char *d_frame_rgbd = NULL;
unsigned int *d_frame_drgb = NULL;
unsigned char *d_frame_depth = NULL;
unsigned char *d_frame_depth_4c = NULL;

unsigned char *d_filter_BGR = NULL;
unsigned char *d_filter_RGBA = NULL;
unsigned char *d_filter_ARGB = NULL;
unsigned char *d_filter_BGRA = NULL;

float *d_filter_out = NULL;
unsigned char  *out_frame_3c = NULL;
unsigned char  *out_frame_4c = NULL;

int iDivUp(int a, int b)
{
	return ((a % b) != 0) ? (a / b + 1) : (a / b);
}

__device__ uchar4 uintTouchar4(unsigned int data)
{
	uchar4 outdata;
	outdata.x = (data & 16711680) >> 16;
	outdata.y = (data & 65280) >> 8;
	outdata.z = (data & 255);
	outdata.w = 0; // top byte is not unpacked; avoid returning an uninitialized field
	return outdata;
}

__device__ unsigned int __usad4(unsigned int A, unsigned int B, unsigned int C = 0)
{
	unsigned int result;
#if (__CUDA_ARCH__ >= 300) // Kepler (SM 3.x) supports a 4 vector SAD SIMD
	asm("vabsdiff4.u32.u32.u32.add" " %0, %1, %2, %3;": "=r"(result) : "r"(A), "r"(B), "r"(C));
#else // SM 2.0            // Fermi  (SM 2.x) supports only 1 SAD SIMD, so there are 4 instructions
	asm("vabsdiff.u32.u32.u32.add" " %0, %1.b0, %2.b0, %3;": "=r"(result) : "r"(A), "r"(B), "r"(C));
	asm("vabsdiff.u32.u32.u32.add" " %0, %1.b1, %2.b1, %3;": "=r"(result) : "r"(A), "r"(B), "r"(result));
	asm("vabsdiff.u32.u32.u32.add" " %0, %1.b2, %2.b2, %3;": "=r"(result) : "r"(A), "r"(B), "r"(result));
	asm("vabsdiff.u32.u32.u32.add" " %0, %1.b3, %2.b3, %3;": "=r"(result) : "r"(A), "r"(B), "r"(result));
#endif
	return result;
}

__device__ int computeSimilarity(unsigned int p, unsigned int q)
{
	int r = (p & 0X00FF0000) >> 16;
	int g = (p & 0X0000FF00) >> 8;
	int b = (p & 0X000000FF);
	int rq = (q & 0X00FF0000) >> 16;
	int gq = (q & 0X0000FF00) >> 8;
	int bq = (q & 0X000000FF);

	return (abs(r - rq) + abs(g - gq) + abs(b - bq));
}

__device__ int computeSimilarity_max(unsigned int p, unsigned int q)
{
	int r = (p & 0X00FF0000) >> 16;
	int g = (p & 0X0000FF00) >> 8;
	int b = (p & 0X000000FF);
	int rq = (q & 0X00FF0000) >> 16;
	int gq = (q & 0X0000FF00) >> 8;
	int bq = (q & 0X000000FF);

	return max(abs(r - rq), max(abs(g - gq), abs(b - bq)));
}

// bilateral filter test
__device__ float euclideanLen(uint a, uint b, float d)
{
	uint ar = a & 0x000000ff;
	uint ag = (a & 0x0000ff00) >> 8;
	uint ab = (a & 0x00ff0000) >> 16;
	uint br = b & 0x000000ff;
	uint bg = (b & 0x0000ff00) >> 8;
	uint bb = (b & 0x00ff0000) >> 16;

	float mod = (br - ar) * (br - ar) +
		(bg - ag) * (bg - ag) +
		(bb - ab) * (bb - ab);

	return exp(-mod / (2.f * d * d));
}


//-----change format------//
__global__ void
changeFilter_RGBAtoBGR(unsigned char *indata, unsigned char *outdata, int w, int h)
{
	int x = blockDim.x*blockIdx.x + threadIdx.x;
	int y = blockDim.y*blockIdx.y + threadIdx.y;

	if (x < w&&y < h)
	{
		int coordinator_in = y*w * 4 + x * 4;
		int coordinator_out = y*w * 3 + x * 3;

		unsigned char r = indata[coordinator_in];
		unsigned char g = indata[coordinator_in + 1];
		unsigned char b = indata[coordinator_in + 2];

		outdata[coordinator_out] = b;
		outdata[coordinator_out + 1] = g;
		outdata[coordinator_out + 2] = r;
	}
}

__global__ void
changeFilter_ARGBtoBGR(unsigned char *indata, unsigned char *outdata, int w, int h)
{
	int x = blockDim.x*blockIdx.x + threadIdx.x;
	int y = blockDim.y*blockIdx.y + threadIdx.y;

	if (x < w&&y < h)
	{
		int coordinator_in = y*w * 4 + x * 4;
		int coordinator_out = y*w * 3 + x * 3;

		unsigned char r = indata[coordinator_in + 1];
		unsigned char g = indata[coordinator_in + 2];
		unsigned char b = indata[coordinator_in + 3];

		outdata[coordinator_out] = b;
		outdata[coordinator_out + 1] = g;
		outdata[coordinator_out + 2] = r;
	}
}

__global__ void
changeFilter_BGRAtoBGR(unsigned char *indata, unsigned char *outdata, int w, int h)
{
	int x = blockDim.x*blockIdx.x + threadIdx.x;
	int y = blockDim.y*blockIdx.y + threadIdx.y;

	if (x < w&&y < h)
	{
		int coordinator_in = y*w * 4 + x * 4;
		int coordinator_out = y*w * 3 + x * 3;

		unsigned char b = indata[coordinator_in];
		unsigned char g = indata[coordinator_in + 1];
		unsigned char r = indata[coordinator_in + 2];

		outdata[coordinator_out] = b;
		outdata[coordinator_out + 1] = g;
		outdata[coordinator_out + 2] = r;
	}
}

__global__ void
changeFilter_BGRtoABGR(void *indata, void*outdata, int w, int h)
{
	int x = blockDim.x*blockIdx.x + threadIdx.x;
	int y = blockDim.y*blockIdx.y + threadIdx.y;
	if (x < w&&y < h)
	{
		int coordinator_out = y*w + x;
		int coordinator_in = coordinator_out * 3;
		uint r = ((uchar*)indata)[coordinator_in ];
		uint g = ((uchar*)indata)[coordinator_in + 1];
		uint b = ((uchar*)indata)[coordinator_in + 2];
		((uint*)outdata)[coordinator_out] = r | (g << 8) | (b << 16);
	}
}

// input formats
__global__ void
changeCV_FRAMEtoDRGB(unsigned char *g_indata, unsigned int *g_odata, int w, int h)
{
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;

	if (y<h&&x<w)
	{
		int coordinator = y*(w * 2) * 3 + 3 * x;// index into the left (color) half of the side-by-side frame
		unsigned int b = g_indata[coordinator];
		unsigned int g = g_indata[coordinator + 1];
		unsigned int r = g_indata[coordinator + 2];

		unsigned int d = g_indata[coordinator + w * 3];// matching pixel in the right half: the depth (gray value) of the left pixel
		if (d == 0) d = 1;// keep the depth value nonzero

		g_odata[y*w + x] = (d << 24) | (r << 16) | (g << 8) | b;
	}
}

__global__ void
changeCV_FRAMEtoDDDD(unsigned char *g_indata, unsigned int *g_odata, int w, int h)
{
	// for viewing the output depth map
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;

	if (y<h&&x<w)
	{
		int coordinator = y*(w * 2) * 3 + 3 * x;
		unsigned int d = g_indata[coordinator + w * 3];
		g_odata[y*w + x] = (d << 24) | (d << 16) | (d << 8) | d;
	}
}

__global__ void
changeCV_FRAMEtoD(unsigned char *g_indata, unsigned char *g_odata, int w, int h)
{
	// for viewing the output depth map
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;

	if (y<h&&x<w)
	{
		unsigned char d = g_indata[y*(w * 2) * 3 + 3 * x + w * 3];
		g_odata[y*w + x] = d;
	}
}

// output formats
__global__ void
OutputFormat_BGR(unsigned int *g_indata, unsigned char *g_odata, int w, int h)
{
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;
	if (y<h&&x<w)
	{
		int coordinator = y*w + x;
		int out_coordinator = y * w * 3 + 3 * x;
		unsigned int r = (g_indata[coordinator] >> 16) & 0x000000FF;
		unsigned int g = (g_indata[coordinator] >> 8) & 0x000000FF;
		unsigned int b = (g_indata[coordinator]) & 0x000000FF;

		g_odata[out_coordinator] = (unsigned char)b;//b
		g_odata[out_coordinator + 1] = (unsigned char)g;//g 
		g_odata[out_coordinator + 2] = (unsigned char)r;//r
		//g_odata[out_coordinator + 3] = 1;//a
	}
}

__global__ void
OutputFormat_Disparity(unsigned int *g_indata, unsigned char *g_odata, int w, int h)
{
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;
	if (y<h&&x<w)
	{
		int coordinator = y*w + x;
		int out_coordinator = y * w * 3 + 3 * x;
		unsigned int r = (g_indata[coordinator] >> 24) & 0x000000FF;
		unsigned int g = r;
		unsigned int b = r;

		g_odata[out_coordinator] = (unsigned char)b;//b
		g_odata[out_coordinator + 1] = (unsigned char)g;//g 
		g_odata[out_coordinator + 2] = (unsigned char)r;//r
		//g_odata[out_coordinator + 3] = 1;//a
	}
}
//-----change format------//

//-----init buffer--------//
__global__ void
initRGBA(void *g_indata, int w, int h)
{
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;

	if (y<h && x<w)
		((int*)g_indata)[y*w + x] = 0;
}
__global__ void
initRGBA3(int *g_indata,int init_value, int w, int h)
{
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;

	if (y < h && x < w)
	{
		int coordinate = y*w * 3 + 3 * x;
		g_indata[coordinate] = init_value;
		g_indata[coordinate + 1] = init_value;
		g_indata[coordinate + 2] = init_value;
	}
}

__global__ void
initD3(unsigned char *g_indata, int init_value, int w, int h)
{
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;

	if (y < h && x < w)
	{
		int coordinate = y*w * 3 + 3 * x;
		g_indata[coordinate] = init_value;
		g_indata[coordinate + 1] = init_value;
		g_indata[coordinate + 2] = init_value;
	}
}
//-----init buffer--------//

//-----change size-------//
__global__ void
upSampleImage(unsigned int * imgIn, unsigned int* imgOut, int h, int w, float upsample_x, float upsample_y)
{
	unsigned int y = blockDim.y * blockIdx.y + threadIdx.y;
	unsigned int x = blockDim.x * blockIdx.x + threadIdx.x;

	unsigned int iny = (unsigned int)(y / upsample_y); //>>
	unsigned int inx = (unsigned int)(x / upsample_x);

	unsigned int in_w = (unsigned int)(w / upsample_x);//>>

	if (x < w && y < h)
	{
		unsigned int indata = imgIn[iny*in_w + inx];
		imgOut[w * y + x] = indata;
	}
}

__global__ void
upSampleDepth(unsigned char * imgIn, unsigned char* imgOut, int h, int w, float upsample_x, float upsample_y)
{
	unsigned int y = blockDim.y * blockIdx.y + threadIdx.y;
	unsigned int x = blockDim.x * blockIdx.x + threadIdx.x;

	unsigned int iny = (unsigned int)(y / upsample_y);
	unsigned int inx = (unsigned int)(x / upsample_x);
	unsigned int in_w = (unsigned int)(w / upsample_x);

	if (x < w && y < h)
	{
		unsigned char indata = imgIn[iny*in_w + inx];
		imgOut[w * y + x] = indata;
	}
}
//-----change size-------//


//----- mapping and warping ---//

__global__ void
mapping_warpingAllView_disparity(unsigned char *g_indata, unsigned char *g_filter, int *g_odata, int *g_odata2,
unsigned char *g_scale, unsigned char *g_scale2,
int w, int h, int num_view,float scale_disparity, int positiveDisparity, bool isTest)
{
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;

	if (y<h && x<w)
	{
		int coordinator_y = y*w;	

		int outx = 0;
		int coordinator_out;// for filter( 3 channel )		

		unsigned char indata = g_indata[coordinator_y+x];
		int indata_d = indata + positiveDisparity;

		int ind, i;
		unsigned int a, b, c;


		unsigned int viewid_demarcationL, viewid_demarcationM, viewid_demarcationR;
		unsigned int int_view_element = 256;

		int out_data = x;
		for (int viewid = 1; viewid <= num_view; viewid++)
		{
			i = viewid - 1;
			viewid_demarcationM = viewid << 8;
			viewid_demarcationL = (viewid - 1) << 8;
			viewid_demarcationR = (viewid + 1) << 8;

			ind = int((indata_d*i) * scale_disparity);
			outx = x - ind;

			out_data = ind;
			if (outx<w && outx >= 0)
			{
				coordinator_out = 3 * (coordinator_y + outx);

				if (!isTest)
				{
					a = (unsigned int)(g_filter[coordinator_out]);//B
					b = (unsigned int)(g_filter[coordinator_out + 1]);//G
					c = (unsigned int)(g_filter[coordinator_out + 2]);//R 
					c = 255 - c;
					b = 255 - b;
					a = 255 - a;
					c *= num_view;
					b *= num_view;
					a *= num_view;
				}
				else
				{
					c = 255 * num_view;
					b = c;
					a = c;
				}

				if (viewid == num_view)
				{
					if (a < int_view_element)
					{
						atomicMax(&g_odata[coordinator_out], out_data);
						g_scale[coordinator_out] = int_view_element - a;
					}
					else if (viewid_demarcationL < a && a <= viewid_demarcationM)
					{
						atomicMax(&g_odata2[coordinator_out], out_data);
						g_scale2[coordinator_out] = a - viewid_demarcationL;
					}
					if (b < int_view_element)
					{
						atomicMax(&g_odata[coordinator_out + 1], out_data);
						g_scale[coordinator_out+1] = int_view_element - b;
					}
					else if (viewid_demarcationL < b && b <= viewid_demarcationM)
					{
						atomicMax(&g_odata2[coordinator_out + 1], out_data);
						g_scale2[coordinator_out+1] = b - viewid_demarcationL;
					}
					if (c < int_view_element)
					{
						atomicMax(&g_odata[coordinator_out + 2], out_data);
						g_scale[coordinator_out+2] = int_view_element - c;
					}
					else if (viewid_demarcationL < c && c <= viewid_demarcationM)
					{
						atomicMax(&g_odata2[coordinator_out + 2], out_data);
						g_scale2[coordinator_out + 2] =  c - viewid_demarcationL;
					}
				}
				else
				{
					if (viewid_demarcationM < a && a <= viewid_demarcationR)
					{
						atomicMax(&g_odata[coordinator_out], out_data);
						g_scale[coordinator_out] = viewid_demarcationR - a;
					}
					else if (viewid_demarcationL < a && a <= viewid_demarcationM)
					{
						atomicMax(&g_odata2[coordinator_out], out_data);
						g_scale2[coordinator_out] = a - viewid_demarcationL;
					}
					if (viewid_demarcationM < b && b <= viewid_demarcationR)
					{
						atomicMax(&g_odata[coordinator_out + 1], out_data);
						g_scale[coordinator_out+1] = viewid_demarcationR - b;
					}
					else if (viewid_demarcationL < b && b <= viewid_demarcationM)
					{
						atomicMax(&g_odata2[coordinator_out + 1], out_data);
						g_scale2[coordinator_out+1] = b - viewid_demarcationL;
					}
					if (viewid_demarcationM < c && c <= viewid_demarcationR)
					{
						atomicMax(&g_odata[coordinator_out + 2], out_data);
						g_scale[coordinator_out+2] = viewid_demarcationR - c;
					}
					else if (viewid_demarcationL < c && c <= viewid_demarcationM)
					{
						atomicMax(&g_odata2[coordinator_out + 2], out_data);
						g_scale2[coordinator_out+2] = c - viewid_demarcationL;
					}
				}
			}
		}
	}
}

//occlusion
__global__ void
occlusion_basedGradient_disparity(unsigned char *g_indata, unsigned char *g_filter, int *g_odata, int *g_odata2, 
unsigned char *g_scale, unsigned char *g_scale2,
int w, int h, int num_view, float scale_disparity, int positiveDisparity, bool isTest)
{
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;
	if (x >= w || y >= h) return;// skip threads outside the image
	int coordinator_y = y*w;
	int coordinator = coordinator_y + x;

	unsigned char indata = g_indata[coordinator];// depth
	int indata_d = indata + positiveDisparity;
	//if (x<w-1)
	{
		unsigned char indataPlus = 0;
		if (x<w - 1)
			indataPlus = g_indata[coordinator + 1];	
		int indata_dPlus = indataPlus + positiveDisparity;

		int d_grad = indata_dPlus - indata_d;// compute gradient

		if (d_grad < 0)// hole
		{
			int i;
			int ind, indPlus;
			int coordinator_out;
			int outx;

			int xHole_l, xHole_r;
			int out_data;
			unsigned int viewid_demarcationL, viewid_demarcationM, viewid_demarcationR;
			unsigned int int_view_element = 256;

			unsigned int a, b, c;

			for (int viewid = 1; viewid <= num_view; viewid++)// loop over views
			{
				i = viewid - 1;
				viewid_demarcationM = viewid << 8;
				viewid_demarcationL = (viewid - 1) << 8;
				viewid_demarcationR = (viewid + 1) << 8;
				ind = int((indata_d*i) * scale_disparity);
				indPlus = int((indata_dPlus*i) * scale_disparity);
				xHole_l = x + 1 - ind;
				xHole_l = max(xHole_l, 0);
				xHole_r = x + 1 - indPlus;
				xHole_r = min(xHole_r, w - 1);

				out_data = ind;

				for (outx = xHole_l; outx<xHole_r; outx++)// loop: fill the hole
				{
					out_data = ind--;// pseudo-disparity for the hole; the decrement has little visible effect

					coordinator_out = 3 * (coordinator_y + outx);
					if (!isTest)
					{
						a = (unsigned int)(g_filter[coordinator_out]);//B
						b = (unsigned int)(g_filter[coordinator_out + 1]);//G
						c = (unsigned int)(g_filter[coordinator_out + 2]);//R 
						c = 255 - c;
						b = 255 - b;
						a = 255 - a;
						c *= num_view;
						b *= num_view;
						a *= num_view;
					}
					else
					{
						c = 255 * num_view;
						b = c;
						a = c;
					}
					if (viewid == num_view)
					{
						if (a < int_view_element)
						{
							atomicMax(&g_odata[coordinator_out], out_data);
							g_scale[coordinator_out] = int_view_element - a;
						}
						else if (viewid_demarcationL < a && a <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out], out_data);
							g_scale2[coordinator_out] = a - viewid_demarcationL;
						}
						if (b < int_view_element)
						{
							atomicMax(&g_odata[coordinator_out + 1], out_data);
							g_scale[coordinator_out + 1] = int_view_element - b;
						}
						else if (viewid_demarcationL < b && b <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out + 1], out_data);
							g_scale2[coordinator_out + 1] = b - viewid_demarcationL;
						}
						if (c < int_view_element)
						{
							atomicMax(&g_odata[coordinator_out + 2], out_data);
							g_scale[coordinator_out + 2] = int_view_element - c;
						}
						else if (viewid_demarcationL < c && c <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out + 2], out_data);
							g_scale2[coordinator_out + 2] = c - viewid_demarcationL;
						}
					}
					else
					{
						if (viewid_demarcationM < a && a <= viewid_demarcationR)
						{
							atomicMax(&g_odata[coordinator_out], out_data);
							g_scale[coordinator_out] = viewid_demarcationR - a;
						}
						else if (viewid_demarcationL < a && a <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out], out_data);
							g_scale2[coordinator_out] = a - viewid_demarcationL;
						}
						if (viewid_demarcationM < b && b <= viewid_demarcationR)
						{
							atomicMax(&g_odata[coordinator_out + 1], out_data);
							g_scale[coordinator_out + 1] = viewid_demarcationR - b;
						}
						else if (viewid_demarcationL < b && b <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out + 1], out_data);
							g_scale2[coordinator_out + 1] = b - viewid_demarcationL;
						}
						if (viewid_demarcationM < c && c <= viewid_demarcationR)
						{
							atomicMax(&g_odata[coordinator_out + 2], out_data);
							g_scale[coordinator_out + 2] = viewid_demarcationR - c;
						}
						else if (viewid_demarcationL < c && c <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out + 2], out_data);
							g_scale2[coordinator_out + 2] = c - viewid_demarcationL;
						}
					}
				}
			}
		}
		if(x == 0 && indata_d<0)// left-boundary case, a hole-filling issue
		{
			int i;
			int ind;
			int coordinator_out;
			int outx;
			int xHole_l, xHole_r;
			unsigned int a, b, c;
			unsigned int viewid_demarcationL, viewid_demarcationM, viewid_demarcationR;
			unsigned int int_view_element = 256;

			for (int viewid = 1; viewid <= num_view; viewid++)
			{
				i = viewid - 1;
				viewid_demarcationM = viewid << 8;
				viewid_demarcationL = (viewid - 1) << 8;
				viewid_demarcationR = (viewid + 1) << 8;
				ind = int((indata_d*i)* scale_disparity);

				xHole_l = 0;
				xHole_r = 0 - ind;

				int out_data = ind;
				for (outx = xHole_r - 1; outx >= xHole_l; outx--)
				{
					out_data += 2;

					coordinator_out = 3 * (coordinator_y + outx);
					if (!isTest)
					{
						a = (unsigned int)(g_filter[coordinator_out]);
						b = (unsigned int)(g_filter[coordinator_out + 1]);
						c = (unsigned int)(g_filter[coordinator_out + 2]);
						c = 255 - c;
						b = 255 - b;
						a = 255 - a;
						c *= num_view;
						b *= num_view;
						a *= num_view;
					}
					else
					{
						c = 255 * num_view;
						b = c;
						a = c;
					}
					if (viewid == num_view)
					{
						if (a < int_view_element)
						{
							atomicMax(&g_odata[coordinator_out], out_data);
							g_scale[coordinator_out] = int_view_element - a;
						}
						else if (viewid_demarcationL < a && a <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out], out_data);
							g_scale2[coordinator_out] = a - viewid_demarcationL;
						}
						if (b < int_view_element)
						{
							atomicMax(&g_odata[coordinator_out + 1], out_data);
							g_scale[coordinator_out + 1] = int_view_element - b;
						}
						else if (viewid_demarcationL < b && b <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out + 1], out_data);
							g_scale2[coordinator_out + 1] = b - viewid_demarcationL;
						}
						if (c < int_view_element)
						{
							atomicMax(&g_odata[coordinator_out + 2], out_data);
							g_scale[coordinator_out + 2] = int_view_element - c;
						}
						else if (viewid_demarcationL < c && c <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out + 2], out_data);
							g_scale2[coordinator_out + 2] = c - viewid_demarcationL;
						}
					}
					else
					{
						if (viewid_demarcationM < a && a <= viewid_demarcationR)
						{
							atomicMax(&g_odata[coordinator_out], out_data);
							g_scale[coordinator_out] = viewid_demarcationR - a;
						}
						else if (viewid_demarcationL < a && a <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out], out_data);
							g_scale2[coordinator_out] = a - viewid_demarcationL;
						}
						if (viewid_demarcationM < b && b <= viewid_demarcationR)
						{
							atomicMax(&g_odata[coordinator_out + 1], out_data);
							g_scale[coordinator_out + 1] = viewid_demarcationR - b;
						}
						else if (viewid_demarcationL < b && b <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out + 1], out_data);
							g_scale2[coordinator_out + 1] = b - viewid_demarcationL;
						}
						if (viewid_demarcationM < c && c <= viewid_demarcationR)
						{
							atomicMax(&g_odata[coordinator_out + 2], out_data);
							g_scale[coordinator_out + 2] = viewid_demarcationR - c;
						}
						else if (viewid_demarcationL < c && c <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out + 2], out_data);
							g_scale2[coordinator_out + 2] = c - viewid_demarcationL;
						}
					}
				}
			}
		}
	}
	//else if (indata_d > 0)// right-boundary case, a hole-filling issue
	//{
	//	int i;
	//	int ind;
	//	int coordinator_out;
	//	int outx;
	//	int xHole_l, xHole_r;
	//	unsigned int a, b, c;
	//	unsigned int viewid_demarcationL, viewid_demarcationM, viewid_demarcationR;
	//	unsigned int int_view_element = 256;
	//	for (int viewid = 1; viewid <= num_view; viewid++)
	//	{
	//		i = viewid - 1;
	//		viewid_demarcationM = viewid << 8;
	//		viewid_demarcationL = (viewid - 1) << 8;
	//		viewid_demarcationR = (viewid + 1) << 8;
	//		ind = int((indata_d*i)* scale_disparity);
	//		xHole_l = x - ind;
	//		xHole_r = w;
	//		int ind_hole = ind+2;
	//		for (outx = xHole_l; outx < xHole_r; outx++)
	//		{
	//			ind_hole -= 2;
	//			coordinator_out = 3 * (coordinator_y + outx);
	//			if (!isTest)
	//			{
	//				a = unsigned int(g_filter[coordinator_out]);
	//				b = unsigned int(g_filter[coordinator_out + 1]);
	//				c = unsigned int(g_filter[coordinator_out + 2]);
	//				c = 255 - c;
	//				b = 255 - b;
	//				a = 255 - a;
	//				c *= num_view;
	//				b *= num_view;
	//				a *= num_view;
	//			}
	//			else
	//			{
	//				c = 255 * num_view;
	//				b = c;
	//				a = c;
	//			}
	//			// last view
	//			if (viewid == num_view)
	//			{
	//				if (a < int_view_element)
	//					atomicMax(&g_odata[coordinator_out], ind_hole);
	//				else if (viewid_demarcationL < a && a <= viewid_demarcationM)
	//					atomicMax(&g_odata2[coordinator_out], ind_hole);
	//				if (b < int_view_element)
	//					atomicMax(&g_odata[coordinator_out + 1], ind_hole);
	//				else if (viewid_demarcationL < b && b <= viewid_demarcationM)
	//					atomicMax(&g_odata2[coordinator_out + 1], ind_hole);
	//				if (c < int_view_element)
	//					atomicMax(&g_odata[coordinator_out + 2], ind_hole);
	//				else if (viewid_demarcationL < c && c <= viewid_demarcationM)
	//					atomicMax(&g_odata2[coordinator_out + 2], ind_hole);
	//			}
	//			else// not the last view
	//			{
	//				if (viewid_demarcationM < a && a <= viewid_demarcationR)
	//					atomicMax(&g_odata[coordinator_out], ind_hole);
	//				else if (viewid_demarcationL < a && a <= viewid_demarcationM)
	//					atomicMax(&g_odata2[coordinator_out], ind_hole);
	//				if (viewid_demarcationM < b && b <= viewid_demarcationR)
	//					atomicMax(&g_odata[coordinator_out + 1], ind_hole);
	//				else if (viewid_demarcationL < b && b <= viewid_demarcationM)
	//					atomicMax(&g_odata2[coordinator_out + 1], ind_hole);
	//				if (viewid_demarcationM < c && c <= viewid_demarcationR)
	//					atomicMax(&g_odata[coordinator_out + 2], ind_hole);
	//				else if (viewid_demarcationL < c && c <= viewid_demarcationM)
	//					atomicMax(&g_odata2[coordinator_out + 2], ind_hole);
	//			}
	//		}
	//	}
	//}
}

// merge the views into the output
__global__ void
sumView_disparity(unsigned int *g_odata, void *g_indata, int *g_disparity, int *g_disparity2, 
unsigned char *g_scale, unsigned char *g_scale2,
int w, int h, int num_view, bool isTest)
{
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;
	if (y<h&&x<w)
	{
		// no boundary handling is done here
		int coordinator_out_y = y*w;
		int subpixel_coordinator_out = 3 * (coordinator_out_y+x);

		unsigned char rscale = g_scale[subpixel_coordinator_out];
		unsigned char gscale = g_scale[subpixel_coordinator_out + 1];
		unsigned char bscale = g_scale[subpixel_coordinator_out + 2];

		unsigned char rscale2 =  g_scale2[subpixel_coordinator_out];
		unsigned char gscale2 = g_scale2[subpixel_coordinator_out + 1];
		unsigned char bscale2 = g_scale2[subpixel_coordinator_out + 2];

		int rdisparity = g_disparity[subpixel_coordinator_out];
		int gdisparity = g_disparity[subpixel_coordinator_out + 1];
		int bdisparity = g_disparity[subpixel_coordinator_out + 2];

		int rdisparity2 = g_disparity2[subpixel_coordinator_out];
		int gdisparity2 = g_disparity2[subpixel_coordinator_out + 1];
		int bdisparity2 = g_disparity2[subpixel_coordinator_out + 2];

		int r_x = x + rdisparity; r_x = max(0, min(w - 1, r_x));
		int g_x = x + gdisparity; g_x = max(0, min(w - 1, g_x));
		int b_x = x + bdisparity; b_x = max(0, min(w - 1, b_x));
		unsigned char rindata = ((unsigned char*)g_indata)[4 * (coordinator_out_y + r_x)];
		unsigned char gindata = ((unsigned char*)g_indata)[4 * (coordinator_out_y + g_x) + 1];
		unsigned char bindata = ((unsigned char*)g_indata)[4 * (coordinator_out_y + b_x) + 2];

		int r_x2 = x + rdisparity2; r_x2 = max(0, min(w - 1, r_x2));
		int g_x2 = x + gdisparity2; g_x2 = max(0, min(w - 1, g_x2));
		int b_x2 = x + bdisparity2; b_x2 = max(0, min(w - 1, b_x2));
		unsigned char rindata2 =  ((unsigned char*)g_indata)[4 * (coordinator_out_y + r_x2)];
		unsigned char gindata2 = ((unsigned char*)g_indata)[4 * (coordinator_out_y + g_x2) + 1];
		unsigned char bindata2 = ((unsigned char*)g_indata)[4 * (coordinator_out_y + b_x2) + 2];

		int routdata = (rindata*rscale + rindata2*rscale2)>>8;
		int goutdata = (gindata*gscale + gindata2*gscale2) >> 8;
		int boutdata = (bindata*bscale + bindata2*bscale2) >> 8;

		g_odata[coordinator_out_y + x] = routdata | (goutdata << 8) | (boutdata << 16);
	}
}

__global__ void
mapping_warpingAllView_coordinate_x(unsigned char *g_indata, unsigned char *g_filter, int *g_odata, int *g_odata2,
unsigned char *g_scale, unsigned char *g_scale2,
int w, int h, int num_view, float scale_disparity, int positiveDisparity, bool isTest)
{
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;

	if (y<h && x<w)
	{
		int coordinator_y = y*w;

		int outx = 0;
		int coordinator_out;// for filter( 3 channel )		

		unsigned char indata = g_indata[coordinator_y + x];
		int indata_d = indata + positiveDisparity;

		int ind, i;
		unsigned int a, b, c;


		unsigned int viewid_demarcationL, viewid_demarcationM, viewid_demarcationR;
		unsigned int int_view_element = 256;

		int out_data = x;
		for (int viewid = 1; viewid <= num_view; viewid++)
		{
			i = viewid - 1;
			viewid_demarcationM = viewid << 8;
			viewid_demarcationL = (viewid - 1) << 8;
			viewid_demarcationR = (viewid + 1) << 8;

			ind = int((indata_d*i) * scale_disparity);
			outx = x - ind;

			//out_data = ind;
			if (outx<w && outx >= 0)
			{
				coordinator_out = 3 * (coordinator_y + outx);

				if (!isTest)
				{
					a = (unsigned int)(g_filter[coordinator_out]);//B
					b = (unsigned int)(g_filter[coordinator_out + 1]);//G
					c = (unsigned int)(g_filter[coordinator_out + 2]);//R 
					c = 255 - c;
					b = 255 - b;
					a = 255 - a;
					c *= num_view;
					b *= num_view;
					a *= num_view;
				}
				else
				{
					c = 255 * num_view;
					b = c;
					a = c;
				}

				if (viewid == num_view)
				{
					if (a < int_view_element)
					{
						atomicMax(&g_odata[coordinator_out], out_data);
						g_scale[coordinator_out] = int_view_element - a;
					}
					else if (viewid_demarcationL < a && a <= viewid_demarcationM)
					{
						atomicMax(&g_odata2[coordinator_out], out_data);
						g_scale2[coordinator_out] = a - viewid_demarcationL;
					}
					if (b < int_view_element)
					{
						atomicMax(&g_odata[coordinator_out + 1], out_data);
						g_scale[coordinator_out + 1] = int_view_element - b;
					}
					else if (viewid_demarcationL < b && b <= viewid_demarcationM)
					{
						atomicMax(&g_odata2[coordinator_out + 1], out_data);
						g_scale2[coordinator_out + 1] = b - viewid_demarcationL;
					}
					if (c < int_view_element)
					{
						atomicMax(&g_odata[coordinator_out + 2], out_data);
						g_scale[coordinator_out + 2] = int_view_element - c;
					}
					else if (viewid_demarcationL < c && c <= viewid_demarcationM)
					{
						atomicMax(&g_odata2[coordinator_out + 2], out_data);
						g_scale2[coordinator_out + 2] = c - viewid_demarcationL;
					}
				}
				else
				{
					if (viewid_demarcationM < a && a <= viewid_demarcationR)
					{
						atomicMax(&g_odata[coordinator_out], out_data);
						g_scale[coordinator_out] = viewid_demarcationR - a;
					}
					else if (viewid_demarcationL < a && a <= viewid_demarcationM)
					{
						atomicMax(&g_odata2[coordinator_out], out_data);
						g_scale2[coordinator_out] = a - viewid_demarcationL;
					}
					if (viewid_demarcationM < b && b <= viewid_demarcationR)
					{
						atomicMax(&g_odata[coordinator_out + 1], out_data);
						g_scale[coordinator_out + 1] = viewid_demarcationR - b;
					}
					else if (viewid_demarcationL < b && b <= viewid_demarcationM)
					{
						atomicMax(&g_odata2[coordinator_out + 1], out_data);
						g_scale2[coordinator_out + 1] = b - viewid_demarcationL;
					}
					if (viewid_demarcationM < c && c <= viewid_demarcationR)
					{
						atomicMax(&g_odata[coordinator_out + 2], out_data);
						g_scale[coordinator_out + 2] = viewid_demarcationR - c;
					}
					else if (viewid_demarcationL < c && c <= viewid_demarcationM)
					{
						atomicMax(&g_odata2[coordinator_out + 2], out_data);
						g_scale2[coordinator_out + 2] = c - viewid_demarcationL;
					}
				}
			}
		}
	}
}
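The demarcation tests in the kernel above are easier to follow on the CPU. The sketch below is one reading of that logic, not part of the original source: `ViewBlend` and `split_views` are hypothetical names, and the weights are kept as `int` here although the kernel stores them in `unsigned char` buffers.

```cpp
// Hypothetical CPU-side helper mirroring the per-subpixel view selection in the kernel.
struct ViewBlend {
    int primary_view = 0;      // 1-based view id that would be written to g_odata (0 = none)
    int primary_weight = 0;    // weight that would be written to g_scale
    int secondary_view = 0;    // 1-based view id that would be written to g_odata2 (0 = none)
    int secondary_weight = 0;  // weight that would be written to g_scale2
};

ViewBlend split_views(unsigned char filter_value, int num_view)
{
    // Same transform as the kernel: invert the mask value and stretch it over all views.
    const unsigned int v = (255u - filter_value) * static_cast<unsigned int>(num_view);
    ViewBlend out;
    for (int viewid = 1; viewid <= num_view; ++viewid) {
        const unsigned int left = static_cast<unsigned int>(viewid - 1) << 8;
        const unsigned int mid = static_cast<unsigned int>(viewid) << 8;
        const unsigned int right = static_cast<unsigned int>(viewid + 1) << 8;
        const bool is_last = (viewid == num_view);
        // The last view wraps around and takes the lowest band (v < 256) as its primary range.
        const bool primary = is_last ? (v < 256u) : (mid < v && v <= right);
        if (primary) {
            out.primary_view = viewid;
            out.primary_weight = static_cast<int>((is_last ? 256u : right) - v);
        } else if (left < v && v <= mid) {
            out.secondary_view = viewid;
            out.secondary_weight = static_cast<int>(v - left);
        }
    }
    return out;
}
```

With 4 views, a mask value of 155 gives v = 400, which falls between view 1 (weight 112) and view 2 (weight 144). The two weights sum to 256, so the later `>> 8` acts as a normalised linear blend of two adjacent views.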

//occlusion
__global__ void
occlusion_basedGradient_coordinate_x(unsigned char *g_indata, unsigned char *g_filter, int *g_odata, int *g_odata2,
unsigned char *g_scale, unsigned char *g_scale2,
int w, int h, int num_view, float scale_disparity, int positiveDisparity, bool isTest)
{
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;
	if (x >= w || y >= h) return;// skip threads outside the image
	int coordinator_y = y*w;
	int coordinator = coordinator_y + x;

	unsigned char indata = g_indata[coordinator];// depth
	int indata_d = indata + positiveDisparity;
	//if (x<w-1)
	{
		unsigned char indataPlus = 0;
		if (x<w - 1)
			indataPlus = g_indata[coordinator + 1];
		int indata_dPlus = indataPlus + positiveDisparity;

		int d_grad = indata_dPlus - indata_d;// compute gradient

		if (d_grad < 0)// negative gradient: a hole opens here
		{
			int i;
			int ind, indPlus;
			int coordinator_out;
			int outx;

			int xHole_l, xHole_r;
			int out_data;
			unsigned int viewid_demarcationL, viewid_demarcationM, viewid_demarcationR;
			unsigned int int_view_element = 256;

			unsigned int a, b, c;

			for (int viewid = 1; viewid <= num_view; viewid++)// loop over all views
			{
				i = viewid - 1;
				viewid_demarcationM = viewid << 8;
				viewid_demarcationL = (viewid - 1) << 8;
				viewid_demarcationR = (viewid + 1) << 8;
				ind = int((indata_d*i) * scale_disparity);
				indPlus = int((indata_dPlus*i) * scale_disparity);
				xHole_l = x + 1 - ind;
				xHole_l = max(xHole_l, 0);
				xHole_r = x + 1 - indPlus;
				xHole_r = min(xHole_r, w - 1);

				out_data = x;

				for (outx = xHole_l; outx<xHole_r; outx++)// loop: fill the hole
				{
					out_data ++;

					coordinator_out = 3 * (coordinator_y + outx);
					if (!isTest)
					{
						a = (unsigned int)g_filter[coordinator_out];// B
						b = (unsigned int)g_filter[coordinator_out + 1];// G
						c = (unsigned int)g_filter[coordinator_out + 2];// R
						c = 255 - c;
						b = 255 - b;
						a = 255 - a;
						c *= num_view;
						b *= num_view;
						a *= num_view;
					}
					else
					{
						c = 255 * num_view;
						b = c;
						a = c;
					}
					if (viewid == num_view)
					{
						if (a < int_view_element)
						{
							atomicMax(&g_odata[coordinator_out], out_data);
							g_scale[coordinator_out] = int_view_element - a;
						}
						else if (viewid_demarcationL < a && a <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out], out_data);
							g_scale2[coordinator_out] = a - viewid_demarcationL;
						}
						if (b < int_view_element)
						{
							atomicMax(&g_odata[coordinator_out + 1], out_data);
							g_scale[coordinator_out + 1] = int_view_element - b;
						}
						else if (viewid_demarcationL < b && b <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out + 1], out_data);
							g_scale2[coordinator_out + 1] = b - viewid_demarcationL;
						}
						if (c < int_view_element)
						{
							atomicMax(&g_odata[coordinator_out + 2], out_data);
							g_scale[coordinator_out + 2] = int_view_element - c;
						}
						else if (viewid_demarcationL < c && c <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out + 2], out_data);
							g_scale2[coordinator_out + 2] = c - viewid_demarcationL;
						}
					}
					else
					{
						if (viewid_demarcationM < a && a <= viewid_demarcationR)
						{
							atomicMax(&g_odata[coordinator_out], out_data);
							g_scale[coordinator_out] = viewid_demarcationR - a;
						}
						else if (viewid_demarcationL < a && a <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out], out_data);
							g_scale2[coordinator_out] = a - viewid_demarcationL;
						}
						if (viewid_demarcationM < b && b <= viewid_demarcationR)
						{
							atomicMax(&g_odata[coordinator_out + 1], out_data);
							g_scale[coordinator_out + 1] = viewid_demarcationR - b;
						}
						else if (viewid_demarcationL < b && b <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out + 1], out_data);
							g_scale2[coordinator_out + 1] = b - viewid_demarcationL;
						}
						if (viewid_demarcationM < c && c <= viewid_demarcationR)
						{
							atomicMax(&g_odata[coordinator_out + 2], out_data);
							g_scale[coordinator_out + 2] = viewid_demarcationR - c;
						}
						else if (viewid_demarcationL < c && c <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out + 2], out_data);
							g_scale2[coordinator_out + 2] = c - viewid_demarcationL;
						}
					}
				}
			}
		}
		if (x == 0 && indata_d<0)// left image border: handled as a hole-filling case
		{
			int i;
			int ind;
			int coordinator_out;
			int outx;
			int xHole_l, xHole_r;
			unsigned int a, b, c;
			unsigned int viewid_demarcationL, viewid_demarcationM, viewid_demarcationR;
			unsigned int int_view_element = 256;

			for (int viewid = 1; viewid <= num_view; viewid++)
			{
				i = viewid - 1;
				viewid_demarcationM = viewid << 8;
				viewid_demarcationL = (viewid - 1) << 8;
				viewid_demarcationR = (viewid + 1) << 8;
				ind = int((indata_d*i)* scale_disparity);

				xHole_l = 0;
				xHole_r = 0 - ind;

				int out_data = 0;
				for (outx = xHole_r - 1; outx >= xHole_l; outx--)
				{
					out_data++;

					coordinator_out = 3 * (coordinator_y + outx);
					if (!isTest)
					{
						a = (unsigned int)g_filter[coordinator_out];
						b = (unsigned int)g_filter[coordinator_out + 1];
						c = (unsigned int)g_filter[coordinator_out + 2];
						c = 255 - c;
						b = 255 - b;
						a = 255 - a;
						c *= num_view;
						b *= num_view;
						a *= num_view;
					}
					else
					{
						c = 255 * num_view;
						b = c;
						a = c;
					}
					if (viewid == num_view)
					{
						if (a < int_view_element)
						{
							atomicMax(&g_odata[coordinator_out], out_data);
							g_scale[coordinator_out] = int_view_element - a;
						}
						else if (viewid_demarcationL < a && a <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out], out_data);
							g_scale2[coordinator_out] = a - viewid_demarcationL;
						}
						if (b < int_view_element)
						{
							atomicMax(&g_odata[coordinator_out + 1], out_data);
							g_scale[coordinator_out + 1] = int_view_element - b;
						}
						else if (viewid_demarcationL < b && b <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out + 1], out_data);
							g_scale2[coordinator_out + 1] = b - viewid_demarcationL;
						}
						if (c < int_view_element)
						{
							atomicMax(&g_odata[coordinator_out + 2], out_data);
							g_scale[coordinator_out + 2] = int_view_element - c;
						}
						else if (viewid_demarcationL < c && c <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out + 2], out_data);
							g_scale2[coordinator_out + 2] = c - viewid_demarcationL;
						}
					}
					else
					{
						if (viewid_demarcationM < a && a <= viewid_demarcationR)
						{
							atomicMax(&g_odata[coordinator_out], out_data);
							g_scale[coordinator_out] = viewid_demarcationR - a;
						}
						else if (viewid_demarcationL < a && a <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out], out_data);
							g_scale2[coordinator_out] = a - viewid_demarcationL;
						}
						if (viewid_demarcationM < b && b <= viewid_demarcationR)
						{
							atomicMax(&g_odata[coordinator_out + 1], out_data);
							g_scale[coordinator_out + 1] = viewid_demarcationR - b;
						}
						else if (viewid_demarcationL < b && b <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out + 1], out_data);
							g_scale2[coordinator_out + 1] = b - viewid_demarcationL;
						}
						if (viewid_demarcationM < c && c <= viewid_demarcationR)
						{
							atomicMax(&g_odata[coordinator_out + 2], out_data);
							g_scale[coordinator_out + 2] = viewid_demarcationR - c;
						}
						else if (viewid_demarcationL < c && c <= viewid_demarcationM)
						{
							atomicMax(&g_odata2[coordinator_out + 2], out_data);
							g_scale2[coordinator_out + 2] = c - viewid_demarcationL;
						}
					}
				}
			}
		}
	}
}

__global__ void
sumView_coordinate_x(unsigned int *g_odata, void *g_indata, int *g_disparity, int *g_disparity2,
unsigned char *g_scale, unsigned char *g_scale2,
int w, int h, int num_view, bool isTest)
{
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;
	if (y<h&&x<w)
	{
		// no boundary handling here
		int coordinator_out_y = y*w;
		int subpixel_coordinator_out = 3 * (coordinator_out_y + x);

		unsigned char rscale = g_scale[subpixel_coordinator_out];
		unsigned char gscale = g_scale[subpixel_coordinator_out + 1];
		unsigned char bscale = g_scale[subpixel_coordinator_out + 2];

		unsigned char rscale2 = g_scale2[subpixel_coordinator_out];
		unsigned char gscale2 = g_scale2[subpixel_coordinator_out + 1];
		unsigned char bscale2 = g_scale2[subpixel_coordinator_out + 2];

		int r_x = g_disparity[subpixel_coordinator_out];
		int g_x = g_disparity[subpixel_coordinator_out + 1];
		int b_x = g_disparity[subpixel_coordinator_out + 2];

		int r_x2 = g_disparity2[subpixel_coordinator_out];
		int g_x2 = g_disparity2[subpixel_coordinator_out + 1];
		int b_x2 = g_disparity2[subpixel_coordinator_out + 2];

		 r_x = max(0, min(w - 1, r_x));
		 g_x = max(0, min(w - 1, g_x));
		 b_x = max(0, min(w - 1, b_x));
		unsigned char rindata = ((unsigned char*)g_indata)[4 * (coordinator_out_y + r_x)];
		unsigned char gindata = ((unsigned char*)g_indata)[4 * (coordinator_out_y + g_x) + 1];
		unsigned char bindata = ((unsigned char*)g_indata)[4 * (coordinator_out_y + b_x) + 2];

		r_x2 = max(0, min(w - 1, r_x2));
		g_x2 = max(0, min(w - 1, g_x2));
		b_x2 = max(0, min(w - 1, b_x2));
		unsigned char rindata2 = ((unsigned char*)g_indata)[4 * (coordinator_out_y + r_x2)];
		unsigned char gindata2 = ((unsigned char*)g_indata)[4 * (coordinator_out_y + g_x2) + 1];
		unsigned char bindata2 = ((unsigned char*)g_indata)[4 * (coordinator_out_y + b_x2) + 2];

		int routdata = (rindata*rscale + rindata2*rscale2) >> 8;
		int goutdata = (gindata*gscale + gindata2*gscale2) >> 8;
		int boutdata = (bindata*bscale + bindata2*bscale2) >> 8;

		g_odata[coordinator_out_y + x] = routdata | (goutdata << 8) | (boutdata << 16);
	}
}
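The final mix in `sumView_coordinate_x` is a fixed-point blend of two warped samples followed by byte packing. A minimal CPU sketch of just that arithmetic, assuming weights on a 0..256 scale; `blend_two_samples` and `pack_channels` are illustrative names that do not appear in the original source:

```cpp
// Hypothetical helpers illustrating the fixed-point blend and packing used by the kernel.
inline int blend_two_samples(unsigned char s1, int w1, unsigned char s2, int w2)
{
    // Weights are on a 0..256 scale, so >> 8 divides the weighted sum by 256.
    return (s1 * w1 + s2 * w2) >> 8;
}

inline unsigned int pack_channels(int c0, int c1, int c2)
{
    // Channel 0 goes in the low byte, channel 1 in bits 8..15, channel 2 in bits 16..23.
    return static_cast<unsigned int>(c0)
         | (static_cast<unsigned int>(c1) << 8)
         | (static_cast<unsigned int>(c2) << 16);
}
```

For instance, samples 200 and 100 with weights 112 and 144 blend to (22400 + 14400) >> 8 = 143.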


//----- mapping and warping ---//

//-----depth pre-processing--//
__global__ void
gn_medianFilter_disparity(unsigned int *g_indata, int w, int h)
{
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;

	int window[9];
	if (x >= w || y >= h) return;// skip threads outside the image
	window[0] = (y == 0 || x == 0) ? 0 : g_indata[(y - 1)* w + x - 1] >> 24;
	window[1] = (y == 0) ? 0 : g_indata[(y - 1)* w + x] >> 24;
	window[2] = (y == 0 || x == w - 1) ? 0 : g_indata[(y - 1)* w + x + 1] >> 24;
	window[3] = (x == 0) ? 0 : g_indata[y* w + x - 1] >> 24;
	window[4] = g_indata[y* w + x] >> 24;
	window[5] = (x == w - 1) ? 0 : g_indata[y* w + x + 1] >> 24;
	window[6] = (y == h - 1 || x == 0) ? 0 : g_indata[(y + 1)* w + x - 1] >> 24;
	window[7] = (y == h - 1) ? 0 : g_indata[(y + 1)* w + x] >> 24;
	window[8] = (y == h - 1 || x == w - 1) ? 0 : g_indata[(y + 1)* w + x + 1] >> 24;
	for (unsigned int j = 0; j<5; ++j)
	{
		int min = j;
		for (unsigned int l = j + 1; l<9; ++l)
			if (window[l] < window[min])
				min = l;
		const int temp = window[j];
		window[j] = window[min];
		window[min] = temp;
	}
	((unsigned char*)g_indata)[4 * (y* w + x) + 3] = (unsigned char)(window[4]);
}
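The median kernel does not sort the whole 3x3 window. It runs only the first five passes of a selection sort, after which element 4 is the median. A small CPU sketch of that idea, where `median_of_9` is a hypothetical name used only for illustration:

```cpp
#include <array>
#include <utility>

// Partial selection sort: after placing the 5 smallest values in order,
// index 4 holds the median of the 9 inputs.
int median_of_9(std::array<int, 9> window)
{
    for (int j = 0; j < 5; ++j) {
        int min_idx = j;
        for (int l = j + 1; l < 9; ++l)
            if (window[l] < window[min_idx])
                min_idx = l;
        std::swap(window[j], window[min_idx]);
    }
    return window[4];
}
```

Stopping after five passes saves the remaining passes of a full sort, which matters when every pixel runs this loop.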

__global__ void
gn_dilate(unsigned int *g_indata, int w, int h)
{
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;

	int window[9];
	if (x >= w || y >= h) return;// skip threads outside the image
	window[0] = (y == 0 || x == 0) ? 0 : g_indata[(y - 1)* w + x - 1] >> 24;
	window[1] = (y == 0) ? 0 : g_indata[(y - 1)* w + x] >> 24;
	window[2] = (y == 0 || x == w - 1) ? 0 : g_indata[(y - 1)* w + x + 1] >> 24;
	window[3] = (x == 0) ? 0 : g_indata[y* w + x - 1] >> 24;
	window[4] = g_indata[y* w + x] >> 24;
	window[5] = (x == w - 1) ? 0 : g_indata[y* w + x + 1] >> 24;
	window[6] = (y == h - 1 || x == 0) ? 0 : g_indata[(y + 1)* w + x - 1] >> 24;
	window[7] = (y == h - 1) ? 0 : g_indata[(y + 1)* w + x] >> 24;
	window[8] = (y == h - 1 || x == w - 1) ? 0 : g_indata[(y + 1)* w + x + 1] >> 24;
	int max_value = window[0];
	for (unsigned int i = 0; i < 9; ++i)
	{
		if (max_value < window[i])
			max_value = window[i];
	}

	((unsigned char*)g_indata)[4 * (y* w + x) + 3] = (unsigned char)max_value;
}

__global__ void
gn_dilate(unsigned char *g_indata, int w, int h)
{
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;

	unsigned char window[9];
	if (x >= w || y >= h) return;// skip threads outside the image
	window[0] = (y == 0 || x == 0) ? 0 : g_indata[(y - 1)* w + x - 1];
	window[1] = (y == 0) ? 0 : g_indata[(y - 1)* w + x] ;
	window[2] = (y == 0 || x == w - 1) ? 0 : g_indata[(y - 1)* w + x + 1];
	window[3] = (x == 0) ? 0 : g_indata[y* w + x - 1];
	window[4] = g_indata[y* w + x];
	window[5] = (x == w - 1) ? 0 : g_indata[y* w + x + 1];
	window[6] = (y == h - 1 || x == 0) ? 0 : g_indata[(y + 1)* w + x - 1];
	window[7] = (y == h - 1) ? 0 : g_indata[(y + 1)* w + x];
	window[8] = (y == h - 1 || x == w - 1) ? 0 : g_indata[(y + 1)* w + x + 1];
	unsigned char max_value = window[4];
	for (int i = 0; i < 9; ++i)
	{
		if (max_value < window[i])
			max_value = window[i];
	}
	g_indata[y* w + x] = max_value;
}
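The single-channel `gn_dilate` above writes its result back into the buffer it reads from, so a thread may see a neighbour that another thread has already updated. As a reference for checking GPU output, here is a CPU sketch of 3x3 dilation that writes to a separate buffer. `dilate3x3` is a hypothetical helper, not part of the original source, and it treats out-of-image neighbours as absent, which gives the same result as padding with 0 for a maximum.

```cpp
#include <algorithm>
#include <vector>

// CPU reference: 3x3 grey-scale dilation into a separate output buffer.
std::vector<unsigned char> dilate3x3(const std::vector<unsigned char>& in, int w, int h)
{
    std::vector<unsigned char> out(in.size(), 0);
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            unsigned char m = 0;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int ny = y + dy;
                    const int nx = x + dx;
                    if (ny < 0 || ny >= h || nx < 0 || nx >= w)
                        continue;  // neighbour lies outside the image
                    m = std::max(m, in[ny * w + nx]);
                }
            }
            out[y * w + x] = m;
        }
    }
    return out;
}
```

Comparing this output with the in-place kernel on a small test image is one way to see whether the read/write overlap changes the result in practice.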

__global__ void
gn_erode(unsigned int *g_indata, int w, int h)
{
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;

	int window[9];
	if (x >= w || y >= h) return;// skip threads outside the image
	window[0] = (y == 0 || x == 0) ? 0 : g_indata[(y - 1)* w + x - 1] >> 24;
	window[1] = (y == 0) ? 0 : g_indata[(y - 1)* w + x] >> 24;
	window[2] = (y == 0 || x == w - 1) ? 0 : g_indata[(y - 1)* w + x + 1] >> 24;
	window[3] = (x == 0) ? 0 : g_indata[y* w + x - 1] >> 24;
	window[4] = g_indata[y* w + x] >> 24;
	window[5] = (x == w - 1) ? 0 : g_indata[y* w + x + 1] >> 24;
	window[6] = (y == h - 1 || x == 0) ? 0 : g_indata[(y + 1)* w + x - 1] >> 24;
	window[7] = (y == h - 1) ? 0 : g_indata[(y + 1)* w + x] >> 24;
	window[8] = (y == h - 1 || x == w - 1) ? 0 : g_indata[(y + 1)* w + x + 1] >> 24;
	int min_value = window[0];
	for (unsigned int i = 0; i < 9; ++i)
	{
		if (min_value > window[i])
			min_value = window[i];
	}

	((unsigned char*)g_indata)[4 * (y* w + x) + 3] = (unsigned char)min_value;
}

__global__ void
gn_erode_disparity3c(unsigned int *g_indata, int w, int h)
{
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;

	int window[9];
	if (x >= w || y >= h) return;// skip threads outside the image
	window[0] = (y == 0 || x == 0) ? 0 : g_indata[(y - 1)* w + x - 1];
	window[1] = (y == 0) ? 0 : g_indata[(y - 1)* w + x] ;
	window[2] = (y == 0 || x == w - 1) ? 0 : g_indata[(y - 1)* w + x + 1];
	window[3] = (x == 0) ? 0 : g_indata[y* w + x - 1] ;
	window[4] = g_indata[y* w + x] ;
	window[5] = (x == w - 1) ? 0 : g_indata[y* w + x + 1];
	window[6] = (y == h - 1 || x == 0) ? 0 : g_indata[(y + 1)* w + x - 1] ;
	window[7] = (y == h - 1) ? 0 : g_indata[(y + 1)* w + x] ;
	window[8] = (y == h - 1 || x == w - 1) ? 0 : g_indata[(y + 1)* w + x + 1] ;
	
	int r = (window[0] & 0x00ff0000) >> 16;
	int g = (window[0] & 0x0000ff00) >> 8;
	int b = window[0] & 0x000000ff;
	int min_value = min(r, min(g, b));
	int value_i;
	for (unsigned int i = 1; i < 9; ++i)
	{
		r = (window[i] & 0x00ff0000) >> 16;
		g = (window[i] & 0x0000ff00) >> 8;
		b = window[i] & 0x000000ff;
		value_i = min(r, min(g, b));
		if (min_value > value_i)
			min_value = value_i;
	}

	((unsigned char*)g_indata)[4 * (y* w + x) ] = (unsigned char)min_value;
}
//-----depth pre-processing--//


__global__ void
gn_copy2DRGB(unsigned int *g_data, uchar* g_indata, int w, int h)// test
{
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;
	if (x >= w || y >= h) return;// skip threads outside the image
	int coordinate = y*w + x;
	uchar disp = g_indata[coordinate];
	((unsigned char*)g_data)[4*coordinate+3] = disp;
}

__global__ void
gn_copy_float2DRGB(uint *g_data, float* g_indata, int w, int h)// test
{
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;
	if (x >= w || y >= h) return;// skip threads outside the image
	int coordinate = y*w + x;
	uchar data = (uchar)(g_indata[coordinate]);

	((uchar*)g_data)[4 * coordinate+3] = data;
}

__global__ void 
gn_copy_uint(uint* g_outdata, uint *g_indata, int w, int h)
{
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;
	if (x >= w || y >= h) return;// skip threads outside the image
	int coordinate = y*w + x;
	g_outdata[coordinate] = g_indata[coordinate];
}

__global__ void
gn_copy_DRGB2out(unsigned char *g_data, uint* g_indata, int w, int h)// test
{
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;
	if (x >= w || y >= h) return;// skip threads outside the image
	int coordinate = y*w + x;
	uchar data = (uchar)(g_indata[coordinate]>>24);
	g_data[3 * coordinate] = data;
	g_data[3 * coordinate + 1] = data;
	g_data[3 * coordinate + 2] = data;
}

__global__ void
disparity_deassign(uint *g_data, uchar *g_filter, uchar* g_indata, int w, int h)
{
	int y = blockDim.y * blockIdx.y + threadIdx.y;
	int x = blockDim.x * blockIdx.x + threadIdx.x;
	if (x >= w || y >= h) return;// skip threads outside the image
	int coordinate = y*w + x;
	uchar data = g_filter[coordinate];

	uchar d_label = g_indata[coordinate];
	uchar d_label_left = d_label;
	uchar d_label_right = d_label;
	if (d_label == 0)
	{
		int left_edge = 0, right_edge = 0;
		while (d_label_right == 0 && right_edge<10 && ((x + right_edge)<w - 1))// stay inside the current row
		{
			right_edge++;
			d_label_right = g_indata[coordinate + right_edge];
		}
		while (d_label_left == 0 && left_edge<10 && ((x - left_edge)>0))//
		{
			left_edge++;
			d_label_left = g_indata[coordinate - left_edge];
		}

		if (d_label_left > d_label_right)// edge of the hole region
		{
			if (data>d_label_right)
				((uchar*)g_data)[4 * coordinate + 3] = d_label_left;
			//if ((abs(d_label_left - data) < abs(data - d_label_right ))&& d_label_left != 0)// 
			//	((uchar*)g_data)[4 * coordinate + 3] = d_label_left;
			//else if (d_label_right!=0)
			//	((uchar*)g_data)[4 * coordinate + 3] = d_label_right;
		}
		else
		{
			if ((abs(data - d_label_left) < abs(d_label_right - data)) && d_label_left != 0)//
				((uchar*)g_data)[4 * coordinate + 3] = d_label_left;
			else if (d_label_right != 0)
				((uchar*)g_data)[4 * coordinate + 3] = d_label_right;
		}

	}
}



extern "C" void InitInBuffer(int w, int h)
{
	imgw = w;
	imgh = h;
	img_w = w / 2;
	img_h = h;

	frame_size_in_3c = sizeof(unsigned char) * imgw* imgh * 3;// 3 channels
	frame_size_in_4c = sizeof(unsigned char) * imgw* imgh * 4;// 4 channels
	frame_size_in_drgb = sizeof(unsigned int)*imgw*imgh;
	frame_size_in_1c = sizeof(unsigned char)*imgw*imgh;// 1 channel

	checkCudaErrors(cudaMalloc((void **)&d_frame_3c, frame_size_in_3c));
	checkCudaErrors(cudaMalloc((void **)&d_frame_4c, frame_size_in_4c));
	checkCudaErrors(cudaMalloc((void **)&d_frame_rgbd, frame_size_in_4c));
	checkCudaErrors(cudaMalloc((void **)&d_frame_drgb, frame_size_in_drgb));
	checkCudaErrors(cudaMalloc((void **)&d_frame_depth, frame_size_in_3c));//d_frame_depth_3c
	checkCudaErrors(cudaMalloc((void **)&d_frame_depth_4c, frame_size_in_4c));//d_frame_depth_4c

	block_mix = dim3(256, 1, 1);
	grid_mix = dim3(iDivUp(imgw, block_mix.x), iDivUp(imgh, block_mix.y), 1);

	checkCudaErrors(cudaMallocPitch(&dImgL_src, &pitch, sizeof(unsigned int)*img_w, img_h));

	checkCudaErrors(cudaMallocPitch(&dImgL_turn, &pitch, sizeof(unsigned int)*img_w, img_h));//no use

	block_init = dim3(256, 1, 1);
	grid_init = dim3(iDivUp(img_w, block_init.x), iDivUp(img_h, block_init.y), 1);

	BLOCK_NUM_init = iDivUp(img_w, block_init.x)*iDivUp(img_h, block_init.y);

	checkCudaErrors(cudaMalloc((void **)&d_Depth_in, sizeof(uchar)*img_w*img_h));//d_Depth_in 8*img_w*img_h 
}
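The grid dimensions above come from `iDivUp`, which is defined outside this excerpt. Assuming it is the usual ceiling division, the sketch below shows why the launched grid is usually wider than the image. `div_up` and `launched_threads_x` are hypothetical names for illustration only.

```cpp
// Hypothetical stand-in for iDivUp: integer division rounded up (for positive a and b).
inline int div_up(int a, int b)
{
    return (a + b - 1) / b;
}

// Number of threads actually launched along x for a given image width and block width.
inline int launched_threads_x(int width, int block_x)
{
    return div_up(width, block_x) * block_x;
}
```

A 1920-pixel row with 256-thread blocks needs 8 blocks, so 2048 threads are launched and 128 of them fall past the right edge. Each kernel therefore has to return early when either coordinate is out of range.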

extern "C" void InitOutBufferAndFliter_BGR(int w, int h, unsigned char *g_filter)
{
	out_w = w;
	out_h = h;

	checkCudaErrors(cudaMallocPitch(&out_frame_3c, &pitch, sizeof(unsigned char)*out_w * 3, out_h));
	checkCudaErrors(cudaMallocPitch(&out_frame_4c, &pitch, sizeof(unsigned char)*out_w * 4, out_h));
	framesize_out_3c = sizeof(unsigned char)*out_w*out_h * 3;
	framesize_out_4c = sizeof(unsigned char)*out_w*out_h * 4;

	checkCudaErrors(cudaMallocPitch(&dImage, &pitch, sizeof(unsigned int)*out_w, out_h));
	checkCudaErrors(cudaMallocPitch(&dView, &pitch, sizeof(unsigned int)*out_w, out_h));

	checkCudaErrors(cudaMallocPitch(&dImgL, &pitch, sizeof(unsigned int)*out_w, out_h));

	checkCudaErrors(cudaMallocPitch(&dImgD, &pitch, sizeof(int)*out_w*3, out_h));// offsets
	checkCudaErrors(cudaMallocPitch(&dImgD2, &pitch, sizeof(int)*out_w*3, out_h));

	checkCudaErrors(cudaMallocPitch(&d_depth_scale, &pitch, sizeof(unsigned char)*out_w * 3, out_h));
	checkCudaErrors(cudaMallocPitch(&d_depth_scale2, &pitch, sizeof(unsigned char)*out_w * 3, out_h)); 

	checkCudaErrors(cudaMalloc((void **)&d_Depth_view0, sizeof(uchar)*out_w*out_h));//d_Depth_view0 8*out_w*out_h
	

	block_multiview = dim3(256, 1, 1);
	grid_multiview = dim3(iDivUp(out_w, block_multiview.x), iDivUp(out_h, block_multiview.y), 1);

	multiplex = (float)out_w / (float)imgw;
	multipley = (float)out_h / (float)imgh;

	multiple_x = (float)out_w / (float)img_w;
	multiple_y = (float)out_h / (float)img_h;

	upsample_x = multiple_x;
	upsample_y = multiple_y;

	block_filter = dim3(256, 1, 1);
	grid_filter = dim3(iDivUp(out_w, block_filter.x), iDivUp(out_h, block_filter.y), 1);

	//BGR
	int filterSizeBGR = sizeof(unsigned char) * out_w* out_h * 3;
	checkCudaErrors(cudaMalloc((void **)&d_filter_BGR, filterSizeBGR));
	checkCudaErrors(cudaMemcpy(d_filter_BGR, g_filter, filterSizeBGR, cudaMemcpyHostToDevice));

	//ARGB
	int filterSizeARGB = sizeof(unsigned char) * out_w* out_h * 4;
	checkCudaErrors(cudaMalloc((void **)&d_filter_ARGB, filterSizeARGB));

	//RGBA
	int filterSizeRGBA = sizeof(unsigned char) * out_w* out_h * 4;
	checkCudaErrors(cudaMalloc((void **)&d_filter_RGBA, filterSizeRGBA));

	//BGRA
	int filterSizeBGRA = sizeof(unsigned char) * out_w* out_h * 4;
	checkCudaErrors(cudaMalloc((void **)&d_filter_BGRA, filterSizeBGRA));

	changeFilter_BGRtoABGR << <grid_filter,block_filter>> >(d_filter_BGR, d_filter_ARGB, out_w, out_h);

}

extern "C" void InitOutBufferAndFliter_RGBA(int w, int h, unsigned char *g_filter)
{
	out_w = w;
	out_h = h;

	checkCudaErrors(cudaMallocPitch(&out_frame_3c, &pitch, sizeof(unsigned char)*out_w * 3, out_h));
	checkCudaErrors(cudaMallocPitch(&out_frame_4c, &pitch, sizeof(unsigned char)*out_w * 4, out_h));
	framesize_out_3c = sizeof(unsigned char)*out_w*out_h * 3;
	framesize_out_4c = sizeof(unsigned char)*out_w*out_h * 4;

	checkCudaErrors(cudaMallocPitch(&dImage, &pitch, sizeof(unsigned int)*out_w, out_h));
	checkCudaErrors(cudaMallocPitch(&dView, &pitch, sizeof(unsigned int)*out_w, out_h));

	checkCudaErrors(cudaMallocPitch(&dImgL, &pitch, sizeof(unsigned int)*out_w, out_h));

	block_multiview = dim3(256, 1, 1);
	grid_multiview = dim3(iDivUp(out_w, block_multiview.x), iDivUp(out_h, block_multiview.y), 1);

	multiplex = (float)out_w / (float)imgw;
	multipley = (float)out_h / (float)imgh;

	multiple_x = (float)out_w / (float)img_w;
	multiple_y = (float)out_h / (float)img_h;

	upsample_x = multiple_x;
	upsample_y = multiple_y;


	//BGR
	int filterSizeBGR = sizeof(unsigned char) * out_w* out_h * 3;
	checkCudaErrors(cudaMalloc((void **)&d_filter_BGR, filterSizeBGR));

	//ARGB
	int filterSizeARGB = sizeof(unsigned char) * out_w* out_h * 4;
	checkCudaErrors(cudaMalloc((void **)&d_filter_ARGB, filterSizeARGB));

	//BGRA
	int filterSizeBGRA = sizeof(unsigned char) * out_w* out_h * 4;
	checkCudaErrors(cudaMalloc((void **)&d_filter_BGRA, filterSizeBGRA));

	//RGBA
	int filterSizeRGBA = sizeof(unsigned char) * out_w* out_h * 4;
	checkCudaErrors(cudaMalloc((void **)&d_filter_RGBA, filterSizeRGBA));
	checkCudaErrors(cudaMemcpy(d_filter_RGBA, g_filter, filterSizeRGBA, cudaMemcpyHostToDevice));
	block_filter = dim3(256, 1, 1);
	grid_filter = dim3(iDivUp(out_w, block_filter.x), iDivUp(out_h, block_filter.y), 1);
	changeFilter_RGBAtoBGR << <grid_filter, block_filter >> >(d_filter_RGBA, d_filter_BGR, out_w, out_h);

}

extern "C" void InitOutBufferAndFliter_ARGB(int w, int h, unsigned char *g_filter)
{
	out_w = w;
	out_h = h;

	checkCudaErrors(cudaMallocPitch(&out_frame_3c, &pitch, sizeof(unsigned char)*out_w * 3, out_h));
	checkCudaErrors(cudaMallocPitch(&out_frame_4c, &pitch, sizeof(unsigned char)*out_w * 4, out_h));
	framesize_out_3c = sizeof(unsigned char)*out_w*out_h * 3;
	framesize_out_4c = sizeof(unsigned char)*out_w*out_h * 4;

	checkCudaErrors(cudaMallocPitch(&dImage, &pitch, sizeof(unsigned int)*out_w, out_h));
	checkCudaErrors(cudaMallocPitch(&dView, &pitch, sizeof(unsigned int)*out_w, out_h));

	checkCudaErrors(cudaMallocPitch(&dImgL, &pitch, sizeof(unsigned int)*out_w, out_h));

	block_multiview = dim3(256, 1, 1);
	grid_multiview = dim3(iDivUp(out_w, block_multiview.x), iDivUp(out_h, block_multiview.y), 1);

	multiplex = (float)out_w / (float)imgw;
	multipley = (float)out_h / (float)imgh;

	multiple_x = (float)out_w / (float)img_w;
	multiple_y = (float)out_h / (float)img_h;

	upsample_x = multiple_x;
	upsample_y = multiple_y;


	//BGR
	int filterSizeBGR = sizeof(unsigned char) * out_w* out_h * 3;
	checkCudaErrors(cudaMalloc((void **)&d_filter_BGR, filterSizeBGR));

	//RGBA
	int filterSizeRGBA = sizeof(unsigned char) * out_w* out_h * 4;
	checkCudaErrors(cudaMalloc((void **)&d_filter_RGBA, filterSizeRGBA));

	//BGRA
	int filterSizeBGRA = sizeof(unsigned char) * out_w* out_h * 4;
	checkCudaErrors(cudaMalloc((void **)&d_filter_BGRA, filterSizeBGRA));

	//ARGB
	int filterSizeARGB = sizeof(unsigned char) * out_w* out_h * 4;
	checkCudaErrors(cudaMalloc((void **)&d_filter_ARGB, filterSizeARGB));
	checkCudaErrors(cudaMemcpy(d_filter_ARGB, g_filter, filterSizeARGB, cudaMemcpyHostToDevice));
	block_filter = dim3(256, 1, 1);
	grid_filter = dim3(iDivUp(out_w, block_filter.x), iDivUp(out_h, block_filter.y), 1);
	changeFilter_ARGBtoBGR << <grid_filter, block_filter >> >(d_filter_ARGB, d_filter_BGR, out_w, out_h);

}

extern "C" void InitOutBufferAndFliter_BGRA(int w, int h, unsigned char *g_filter)
{
	out_w = w;
	out_h = h;

	checkCudaErrors(cudaMallocPitch(&out_frame_3c, &pitch, sizeof(unsigned char)*out_w * 3, out_h));
	checkCudaErrors(cudaMallocPitch(&out_frame_4c, &pitch, sizeof(unsigned char)*out_w * 4, out_h));
	framesize_out_3c = sizeof(unsigned char)*out_w*out_h * 3;
	framesize_out_4c = sizeof(unsigned char)*out_w*out_h * 4;

	checkCudaErrors(cudaMallocPitch(&dImage, &pitch, sizeof(unsigned int)*out_w, out_h));
	checkCudaErrors(cudaMallocPitch(&dView, &pitch, sizeof(unsigned int)*out_w, out_h));

	checkCudaErrors(cudaMallocPitch(&dImgL, &pitch, sizeof(unsigned int)*out_w, out_h));

	block_multiview = dim3(256, 1, 1);
	grid_multiview = dim3(iDivUp(out_w, block_multiview.x), iDivUp(out_h, block_multiview.y), 1);

	multiplex = (float)out_w / (float)imgw;
	multipley = (float)out_h / (float)imgh;

	multiple_x = (float)out_w / (float)img_w;
	multiple_y = (float)out_h / (float)img_h;

	upsample_x = multiple_x;
	upsample_y = multiple_y;

	//ARGB
	int filterSizeARGB = sizeof(unsigned char) * out_w* out_h * 4;
	checkCudaErrors(cudaMalloc((void **)&d_filter_ARGB, filterSizeARGB));

	//BGR
	int filterSizeBGR = sizeof(unsigned char) * out_w* out_h * 3;
	checkCudaErrors(cudaMalloc((void **)&d_filter_BGR, filterSizeBGR));

	//RGBA
	int filterSizeRGBA = sizeof(unsigned char) * out_w* out_h * 4;
	checkCudaErrors(cudaMalloc((void **)&d_filter_RGBA, filterSizeRGBA));

	//BGRA
	int filterSizeBGRA = sizeof(unsigned char) * out_w* out_h * 4;
	checkCudaErrors(cudaMalloc((void **)&d_filter_BGRA, filterSizeBGRA));
	checkCudaErrors(cudaMemcpy(d_filter_BGRA, g_filter, filterSizeBGRA, cudaMemcpyHostToDevice));
	block_filter = dim3(256, 1, 1);
	grid_filter = dim3(iDivUp(out_w, block_filter.x), iDivUp(out_h, block_filter.y), 1);
	changeFilter_BGRAtoBGR << <grid_filter, block_filter >> >(d_filter_BGRA, d_filter_BGR, out_w, out_h);

}

extern "C" void CudaFreeBuffer()
{
	checkCudaErrors(cudaFree(dImgL_src));
	checkCudaErrors(cudaFree(dImgL_turn));
	checkCudaErrors(cudaFree(dImgL));
	checkCudaErrors(cudaFree(dImage));
	checkCudaErrors(cudaFree(dView));
	checkCudaErrors(cudaFree(d_frame_3c));
	checkCudaErrors(cudaFree(d_frame_4c));
	checkCudaErrors(cudaFree(d_frame_depth));
	checkCudaErrors(cudaFree(d_frame_depth_4c));
	checkCudaErrors(cudaFree(d_frame_rgbd));
	checkCudaErrors(cudaFree(d_frame_drgb));
	checkCudaErrors(cudaFree(d_filter_BGR));
	checkCudaErrors(cudaFree(d_filter_RGBA));
	checkCudaErrors(cudaFree(d_filter_ARGB));
	checkCudaErrors(cudaFree(d_filter_BGRA));
	checkCudaErrors(cudaFree(out_frame_3c));
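	checkCudaErrors(cudaFree(out_frame_4c));
	// The buffers below are allocated in InitInBuffer / InitOutBufferAndFliter_BGR but were never released.
	// Assumption: these pointers are zero-initialized globals, so cudaFree is a no-op when they were never allocated.
	checkCudaErrors(cudaFree(d_Depth_in));
	checkCudaErrors(cudaFree(d_Depth_view0));
	checkCudaErrors(cudaFree(dImgD));
	checkCudaErrors(cudaFree(dImgD2));
	checkCudaErrors(cudaFree(d_depth_scale));
	checkCudaErrors(cudaFree(d_depth_scale2));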
}

extern "C" void CudaProcess_DIBR_BGR(unsigned char *g_inframe, unsigned char *g_odata, char *g_oimg_disparity,int num_view, float k = 0.625, int zero_disparity=-128,  bool isTest = false)
{
	cudaEvent_t start1;
	cudaEventCreate(&start1);
	cudaEvent_t stop1;
	cudaEventCreate(&stop1);	
	cudaEventRecord(start1, NULL);

	cudaMemcpy(d_frame_3c, g_inframe, frame_size_in_3c, cudaMemcpyHostToDevice);

	changeCV_FRAMEtoDRGB << <grid_init, block_init >> >(d_frame_3c, dImgL_src, img_w, img_h);
  

	changeCV_FRAMEtoD << <grid_init, block_init >> >(d_frame_3c, d_Depth_in, img_w, img_h);
	gn_dilate << <grid_init, block_init >> >(d_Depth_in, img_w, img_h);// depth-map preprocessing (3x3 dilation)
	upSampleDepth << <grid_multiview, block_multiview >> >(d_Depth_in, d_Depth_view0, out_h, out_w, upsample_x, upsample_y);

	//cudaMemcpy(g_oimg_disparity, d_Depth_view0, sizeof(uchar)*out_w*out_h, cudaMemcpyDeviceToHost);	//return;
		
	upSampleImage << <grid_multiview, block_multiview >> >(dImgL_src, dImgL, out_h, out_w, upsample_x, upsample_y);
	initRGBA << <grid_multiview, block_multiview >> >(dView, out_w, out_h);
	initRGBA << <grid_multiview, block_multiview >> >(dImage, out_w, out_h);

	initRGBA3 << <grid_multiview, block_multiview >> >(dImgD, -zero_disparity*2, out_w, out_h);
	initRGBA3 << <grid_multiview, block_multiview >> >(dImgD2, -zero_disparity*2, out_w, out_h);

	initD3 << <grid_multiview, block_multiview >> >(d_depth_scale, 0, out_w, out_h);
	initD3 << <grid_multiview, block_multiview >> >(d_depth_scale2, 0, out_w, out_h);

	//----- parameter computation ----
	float scale_disparity = 21.25;
	if (k>0 && k<10)	
		scale_disparity = float(k*out_w / 25600.0);
	zero_disparity = min(255, max(zero_disparity, 0));
	
	//--- view synthesis ---//

	mapping_warpingAllView_disparity << <grid_multiview, block_multiview >> >(d_Depth_view0, d_filter_BGR, dImgD, dImgD2, d_depth_scale, d_depth_scale2,out_w, out_h, num_view, scale_disparity, -zero_disparity, isTest);
	occlusion_basedGradient_disparity << <grid_multiview, block_multiview >> >(d_Depth_view0, d_filter_BGR, dImgD, dImgD2, d_depth_scale, d_depth_scale2, out_w, out_h, num_view, scale_disparity, -zero_disparity, isTest);
	sumView_disparity << <grid_multiview, block_multiview >> >(dView, dImgL, dImgD, dImgD2, d_depth_scale, d_depth_scale2, out_w, out_h, num_view, isTest);

	//mapping_warpingAllView_coordinate_x << <grid_multiview, block_multiview >> >(d_Depth_view0, d_filter_BGR, dImgD, dImgD2, d_depth_scale, d_depth_scale2, out_w, out_h, num_view, scale_disparity, -zero_disparity, isTest);
	//occlusion_basedGradient_coordinate_x << <grid_multiview, block_multiview >> >(d_Depth_view0, d_filter_BGR, dImgD, dImgD2, d_depth_scale, d_depth_scale2, out_w, out_h, num_view, scale_disparity, -zero_disparity, isTest);
	//sumView_coordinate_x << <grid_multiview, block_multiview >> >(dView, dImgL, dImgD, dImgD2, d_depth_scale, d_depth_scale2, out_w, out_h, num_view, isTest);

	//--- view synthesis ---//


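	cudaEventRecord(stop1, NULL);
	cudaEventSynchronize(stop1);
	float msecTotal1 = 0.0f;
	cudaEventElapsedTime(&msecTotal1, start1, stop1);
	cudaEventDestroy(start1);// release the timing events so they do not leak on every frame
	cudaEventDestroy(stop1);
	cout << "GPU processing time: " << msecTotal1 << "ms" << endl;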
	OutputFormat_BGR << <grid_multiview, block_multiview >> >(dView, out_frame_3c, out_w, out_h);
	//OutputFormat_Disparity << <grid_multiview, block_multiview >> >(dImgL, out_frame_3c, out_w, out_h);
	cudaMemcpy(g_odata, out_frame_3c, framesize_out_3c, cudaMemcpyDeviceToHost);
}
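The parameter step in `CudaProcess_DIBR_BGR` turns the user factor `k` and the zero-disparity plane into a per-view horizontal shift. The following CPU sketch restates that arithmetic as read from the listing. `disparity_scale` and `view_shift` are hypothetical names, and `view_index` stands for the kernel's `viewid - 1`.

```cpp
#include <algorithm>

// Scale factor as computed in the host function: k is only honoured inside (0, 10).
float disparity_scale(float k, int out_w)
{
    float scale = 21.25f;  // fallback value used by the listing
    if (k > 0 && k < 10)
        scale = float(k * out_w / 25600.0);
    return scale;
}

// Horizontal shift in pixels for one depth sample and one view.
// The kernels then sample at x - shift.
int view_shift(unsigned char depth, int zero_disparity, int view_index, float scale)
{
    const int zero = std::min(255, std::max(zero_disparity, 0));  // clamp to the 8-bit depth range
    const int relative_depth = int(depth) - zero;                 // signed distance from the zero plane
    return int((relative_depth * view_index) * scale);            // truncates toward zero
}
```

With k = 0.625 and a 3840-pixel-wide output the scale is 0.09375. A depth of 255 with the zero plane at 128 then shifts by 95 pixels at view index 8, while a depth of 0 shifts by -96 pixels, which is the opposite direction.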

Main contributors

邢树军 于迅博 刘博阳 叶韵菲 刘彤彤