
Going from pixel position to world position using Pix4D calibration


I’ve written software that detects spray-painted GCPs using convolutional networks. The detection works OK now, but I need to go from the pixel position to the real-world 3D position in order to know which GCP corresponds to which detection.

Since there is not (yet?) an API for Pix4D, I’m trying to solve this by reading the camera positions file that Pix4D generates and then tracing a ray from the camera to the GCP using the detected pixel location. I’m having some problems getting the geometry to work, though; it’s not as easy as I thought.
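To make the ray idea concrete, here is a minimal sketch of turning a detected pixel into a viewing ray. It assumes an ideal pinhole camera with no lens distortion, a focal length `f` and principal point `(cx, cy)` in pixels, a camera that looks along its own +Z axis, and a rotation matrix that maps camera axes to world axes. All of these names and conventions are assumptions for illustration; the axis and sign conventions in the Pix4D output files may differ and would need to be checked against the Pix4D documentation.

```python
import math

def pixel_to_ray(u, v, f, cx, cy, cam_to_world, camera_center):
    """Return (origin, unit direction) of the viewing ray in world coordinates.

    Assumptions (hypothetical, not taken from Pix4D): ideal pinhole camera,
    camera looks along +Z, cam_to_world is a 3x3 rotation given as nested lists.
    """
    # Direction of the pixel in the camera frame.
    d_cam = ((u - cx) / f, (v - cy) / f, 1.0)
    # Rotate the direction into the world frame.
    d_world = [sum(cam_to_world[i][j] * d_cam[j] for j in range(3)) for i in range(3)]
    # Normalize so the direction has unit length.
    norm = math.sqrt(sum(c * c for c in d_world))
    return tuple(camera_center), tuple(c / norm for c in d_world)
```

For a camera pointing straight down, `cam_to_world` could be `[[1, 0, 0], [0, -1, 0], [0, 0, -1]]`, in which case the pixel at the principal point gives a ray pointing along world -Z.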

My question is: using Pix4D, is the calibrated camera file the best approach for going from a 2D pixel position to a 3D position?

I realize this may be out of scope for the user forum, but I figured you might be able to help me out or suggest alternative approaches.

Thanks in advance

Best regards,


Hey Jose,

That’s some pretty cool work you’ve got going on there! We’re doing some development in that area ourselves.

One possible approach may be to look into the .p4d file further; it may be easier to access the image coordinate and orientation data from there, as it has a friendlier format. It would even be possible to import the GCPs automatically into the project by adding a GCP registry and associating each identified GCP in the image with an image mark in the .p4d file.

If you can identify the same GCP in multiple images via an alternative identifier (possibly color?), it might not be necessary to approach it from your current angle. In larger projects, even a color identifier might not be necessary, as you might only have a single GCP in each image and could identify the GCP from its coordinates and the EXIF location of the images that reported a GCP, i.e. “These images were within X distance of Y GCP with only one detection and so could only see Y GCP.”
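The proximity rule above could be sketched roughly as follows. This assumes the image position from EXIF and the GCP coordinates have already been converted to the same projected coordinate system in meters (EXIF normally stores latitude and longitude, so a conversion step would be needed first). The function name and data layout are hypothetical.

```python
import math

def assign_gcp_by_proximity(image_xy, gcps, max_dist):
    """Return the label of the only GCP within max_dist of the image, else None.

    image_xy: (x, y) of the image in projected coordinates (assumed meters).
    gcps: dict mapping a GCP label to its (x, y) in the same coordinates.
    """
    near = [
        label
        for label, (x, y) in gcps.items()
        if math.hypot(x - image_xy[0], y - image_xy[1]) <= max_dist
    ]
    # Only trust the assignment when exactly one GCP could be visible.
    return near[0] if len(near) == 1 else None
```

Returning `None` when zero or several GCPs are in range keeps ambiguous images out of the automatic assignment, so they can be handled by another method.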

I personally would be interested in knowing more about your project having some experience in Tensorflow and OpenCV myself, if you’d be willing to share what you’re doing.

Cool stuff!

Yours Sincerely,


Hello Timothy,

Thank you! It is pretty fun.

I ended up solving the geometry and getting the point by projecting the rays. It took me a while, but it’s the best accuracy I can get. (Thanks a ton to whoever wrote this article!) Accuracy is about 0.5-1 meters using only two rays.
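For anyone reading along, one common way to get a point from two rays is to take the midpoint of the shortest segment between them, since two measured rays rarely intersect exactly. The sketch below shows that midpoint method under my own naming; it is not necessarily the method from the article mentioned above.

```python
import math

def triangulate_two_rays(p1, d1, p2, d2, eps=1e-12):
    """Return (midpoint, gap) for rays p1 + s*d1 and p2 + t*d2.

    midpoint is the middle of the shortest segment between the two lines,
    gap is the length of that segment (a rough indicator of consistency).
    """
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    w0 = [a - b for a, b in zip(p1, p2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b
    if abs(denom) < eps:
        raise ValueError("rays are (nearly) parallel")
    # Parameters of the closest points on each line.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    q1 = [p + s * v for p, v in zip(p1, d1)]
    q2 = [p + t * v for p, v in zip(p2, d2)]
    midpoint = tuple((x + y) / 2 for x, y in zip(q1, q2))
    return midpoint, math.dist(q1, q2)
```

The returned gap can serve as a sanity check: a large gap suggests the two detections may not belong to the same GCP, or that the camera parameters are off.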

I hadn’t thought of using the p4d file. It’s helpful because it contains the full image path. The camera calibration files, [ProjectName]_calibrated_external_camera_parameters.txt and [ProjectName], only contain the image name without the path, which is problematic when you have more than one folder with images with the same name.

About the computer vision, I actually emailed Pix4D when I created the model some months ago. I figured the easiest way to get it running and detecting GCPs was integrating it in Pix4D. I got a kind answer from one of your colleagues saying that you develop everything in-house so no integration was possible. I understand it’s not easy to integrate external efforts with the internal work of your dev team.

But since then, no GCP detection was released in Pix4D, and I think maybe you will focus on other topics, so I figured I would try to get it working separately ;)

Here is a video explaining the concept and showing the performance of the detector:

The model is trained end-to-end, no “hand made” computer vision is done, it is running on Tensorflow and takes about a second per image to run on my machine.

By the way, I didn’t know you had engineers in San Francisco. I was there this Easter; what an amazing ecosystem of companies there is there!






Glad you solved the geometry!

If you could use that to identify the individual GCPs (by comparing the computed position to the GCP position within a threshold?), you could take your program a step further and create a small utility to actually place the GCP marks for each image within Pix4D projects using the .p4d file structure as well.
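As a rough illustration of the threshold comparison, something like the following would pick the nearest surveyed GCP and reject the match if it is too far away. The function name, data layout, and threshold value are assumptions; with the 0.5-1 meter accuracy reported earlier in the thread, a threshold of a few meters might be a reasonable starting point, provided GCPs are spaced farther apart than that.

```python
import math

def match_to_gcp(point, gcps, threshold):
    """Return the label of the nearest GCP if it lies within threshold, else None.

    point: computed (x, y, z) position of a detection.
    gcps: dict mapping a GCP label to its surveyed (x, y, z) position.
    """
    best_label, best_dist = None, float("inf")
    for label, pos in gcps.items():
        dist = math.dist(point, pos)
        if dist < best_dist:
            best_label, best_dist = label, dist
    # Reject the match when even the nearest GCP is too far away.
    return best_label if best_dist <= threshold else None
```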

Then you could leverage Pix4D to provide that multi-ray accuracy from every picture you could detect that GCP in.

We’ve been testing our machine learning and target identification setups within Pix4Dtagger to make sure we can provide the best identifier we can once we fully integrate it into the product. 

If you’re still willing to discuss integration I can likely get this to the interested parties in the company, but I would also definitely understand if you want to go your own route.

Keep up the cool things!

Hello Timothy,

Sorry for the late reply. Yes, my idea is to generate an image marks CSV file that can be imported in the GCP editor in Pix4D.
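A minimal sketch of writing such a file is below. The column order used here (image name, GCP label, pixel x, pixel y) is an assumption for illustration only; the exact layout that the Pix4D GCP editor expects should be checked against the Pix4D documentation before relying on it.

```python
import csv
import io

def write_image_marks(marks, fileobj):
    """Write one CSV row per mark.

    marks: iterable of (image_name, gcp_label, x_pixel, y_pixel).
    The column order is hypothetical, not a confirmed Pix4D format.
    """
    writer = csv.writer(fileobj)
    for image_name, gcp_label, x, y in marks:
        # Two decimals keep sub-pixel precision without noisy float output.
        writer.writerow([image_name, gcp_label, f"{x:.2f}", f"{y:.2f}"])

# Usage example with an in-memory buffer instead of a real file.
buffer = io.StringIO()
write_image_marks([("IMG_0001.JPG", "gcp1", 1024.5, 768.25)], buffer)
```

When writing to a real file, open it with `newline=""` so the `csv` module controls the line endings.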

I am currently working on making this available via a cloud service, but yes, I am also open to discussing integration in Pix4D. It would make it even easier for the end user.

I can’t wait to see what Pix4D will look like in the coming years with all this machine learning!