OpenVZ Forum: Support » CUDA support inside containers

How can I run CUDA workloads in multiple containers?


CUDA support inside containers [message #52619]

Sat, 12 November 2016 13:33

abufrejoval
Messages: 21
Registered: November 2016
Location: Frankfurt

Junior Member

I'm trying to set up a playground for machine learning engineers based on a beefy dual-socket Xeon E5 with plenty of RAM and a Tesla GPU from Nvidia (a pure compute GPU, no video-out).

CUDA seems to manage multitasking well enough: as long as limited resources such as GPU RAM are not exhausted, multiple applications run happily side by side.
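To keep an eye on the one resource that seems to matter here, GPU RAM, something like the following could be run on the host. It is only a rough sketch: it assumes `nvidia-smi` is installed and on the PATH, and the helper names `parse_memory_csv` and `query_gpu_memory` are made up for illustration.

```python
import subprocess


def parse_memory_csv(text):
    """Parse lines of 'used, total' (in MiB) into a list of (used, total) ints."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Ask nvidia-smi for used/total memory per GPU (assumes nvidia-smi is on PATH)."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_memory_csv(result.stdout)


if __name__ == "__main__":
    try:
        for index, (used, total) in enumerate(query_gpu_memory()):
            print(f"GPU {index}: {used} MiB used of {total} MiB")
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        # No driver or no nvidia-smi on this machine.
        print(f"could not query GPU memory: {exc}")
```

This only reports usage; it does nothing to enforce a per-application or per-container limit.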

The main issue I'm trying to solve is that most machine learning application stacks come with distinct userlands: there is Ubuntu, CentOS or plain Docker, in all kinds of versions. I'd like them to co-exist happily without re-installation or exclusivity (PCI passthrough to a VM). After all, that's what container virtualization was designed to do, right?

While I've seen reports that CUDA workloads are possible with Docker, I'd always rather run Docker inside an OpenVZ container, and I'd also rather give the guys the IaaS experience they are used to. They are also likely to do development work inside the containers, and that's where Docker starts to become cumbersome.

The problem is that this new generic system resource, the GPU, isn't yet treated like CPU, RAM or storage by OpenVZ: there is no built-in redirection layer for GPUs (BTW: how would that look with AMD APUs?).

The CUDA software evidently needs access to /dev/nvidia* to get things done, and inside a container that currently seems to be a no-no.
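For reference, here is a small sketch that lists the device nodes in question along with their major/minor numbers and whether the current user can open them. Running it on the host and again inside a container should show what is missing. The function name `describe_device_nodes` is just an illustration, and the exact set of /dev/nvidia* nodes depends on the driver version, so treat the output as a starting point only.

```python
import glob
import os
import stat


def describe_device_nodes(pattern="/dev/nvidia*"):
    """Return (path, major, minor, accessible) for each character device matching pattern."""
    nodes = []
    for path in sorted(glob.glob(pattern)):
        st = os.stat(path)
        if not stat.S_ISCHR(st.st_mode):
            continue  # skip anything that is not a character device
        nodes.append(
            (
                path,
                os.major(st.st_rdev),
                os.minor(st.st_rdev),
                os.access(path, os.R_OK | os.W_OK),
            )
        )
    return nodes


if __name__ == "__main__":
    found = describe_device_nodes()
    if not found:
        print("no /dev/nvidia* character devices visible here")
    for path, major, minor, accessible in found:
        print(f"{path}: char {major}:{minor} accessible={accessible}")
```

The major/minor pairs are what any device pass-through mechanism would ultimately have to expose to the container.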

Since this is going to be a rather generic issue going forward: any ideas on how you want to implement that?

And is there a dirty hack that could make this possible in the meantime?

Security isn't an issue in this context: they are all friends in this case. But of course, security and strict resource allocation would be required for a production variant.


[Message index]

		CUDA support inside containers By: abufrejoval on Sat, 12 November 2016 13:33
		Re: CUDA support inside containers By: khorenko on Mon, 14 November 2016 18:04
		Re: CUDA support inside containers By: abufrejoval on Mon, 14 November 2016 22:28
		Re: CUDA support inside containers By: khorenko on Tue, 15 November 2016 07:58
		Re: CUDA support inside containers By: abufrejoval on Tue, 15 November 2016 00:38
		Re: CUDA support inside containers By: khorenko on Tue, 15 November 2016 05:45
		Re: CUDA support inside containers By: abufrejoval on Wed, 16 November 2016 02:37
		Re: CUDA support inside containers By: khorenko on Wed, 16 November 2016 06:02
		Re: CUDA support inside containers By: abufrejoval on Fri, 18 November 2016 03:37
		Re: CUDA support inside containers By: abufrejoval on Fri, 18 November 2016 04:10
		Re: CUDA support inside containers By: khorenko on Fri, 18 November 2016 18:07
		Re: CUDA support inside containers By: abufrejoval on Mon, 21 November 2016 01:11
		Re: CUDA support inside containers By: abufrejoval on Mon, 21 November 2016 03:55
		OverlayFS (was Re: CUDA support inside containers) By: abufrejoval on Mon, 21 November 2016 12:18
		Re: OverlayFS (was Re: CUDA support inside containers) By: khorenko on Mon, 21 November 2016 16:11
		Re: OverlayFS (was Re: CUDA support inside containers) By: abufrejoval on Tue, 22 November 2016 08:42
		Re: OverlayFS (was Re: CUDA support inside containers) By: khorenko on Wed, 23 November 2016 15:57
		Re: OverlayFS (was Re: CUDA support inside containers) By: abufrejoval on Wed, 23 November 2016 19:17
		Re: OverlayFS (was Re: CUDA support inside containers) By: abufrejoval on Mon, 28 November 2016 14:34
