This is the README file from the Linux side of CUDA.
How come... the Windows README is so thin, while the Linux one is this thorough?

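Anyway, it says that SLI can be used in three modes.
    First, the GPUs take turns rendering whole frames, much like double buffering.
    Second, each frame is split horizontally into n pieces and each GPU renders one piece (the split can shift depending on the load on each GPU).
    Third, the GPUs share the work of removing jagged edges (commonly called antialiasing).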


In Linux, with two GPUs SLI and Multi-GPU can both operate in one of three
modes: Alternate Frame Rendering (AFR), Split Frame Rendering (SFR), and
Antialiasing (AA). When AFR mode is active, one GPU draws the next frame while
the other one works on the frame after that. In SFR mode, each frame is split
horizontally into two pieces, with one GPU rendering each piece. The split
line is adjusted to balance the load between the two GPUs. AA mode splits
antialiasing work between the two GPUs. Both GPUs work on the same scene and
the result is blended together to produce the final frame. This mode is useful
for applications that spend most of their time processing with the CPU and
cannot benefit from AFR.
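
The two-GPU AFR and SFR behaviour described above can be sketched in a few lines of Python. This is only a toy model, not how the driver is implemented: the function names, the GPU numbering, and the proportional rebalancing rule are all assumptions for illustration. The README only says that the split line "is adjusted to balance the load".

```python
def afr_gpu(frame_index, num_gpus=2):
    """AFR: whole frames are handed to the GPUs in turn (numbering is assumed)."""
    return frame_index % num_gpus


def rebalance_split(split_row, height, time_top, time_bottom):
    """SFR: pick a new split row so both halves would take equal time.

    Hypothetical rule: assume render time is proportional to the number of
    rows, using the per-row cost measured on the previous frame.
    """
    top_rows = split_row
    bottom_rows = height - split_row
    # Solve (time_top / top_rows) * s == (time_bottom / bottom_rows) * (height - s)
    numerator = height * time_bottom * top_rows
    denominator = time_top * bottom_rows + time_bottom * top_rows
    return round(numerator / denominator)


if __name__ == "__main__":
    print([afr_gpu(f) for f in range(4)])
    print(rebalance_split(500, 1000, time_top=2.0, time_bottom=1.0))
```

If the top half took twice as long as the bottom half, the split line moves up so the slower GPU gets fewer rows on the next frame.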

With four GPUs, the same options are applicable. AFR mode cycles through all
four GPUs, each GPU rendering a frame in turn. SFR mode splits the frame
horizontally into four pieces. AA mode splits the work between the four GPUs,
allowing antialiasing up to 64x. With four GPUs SLI can also operate in an
additional mode, Alternate Frame Rendering of Antialiasing (AFR of AA). With
AFR of AA, pairs of GPUs render alternate frames, each GPU in a pair doing
half of the antialiasing work. Note that these scenarios apply whether you
have four separate cards or you have two cards, each with two GPUs.
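
As a rough illustration of the four-GPU case, the sketch below shows which GPU would handle what in each mode. The GPU numbering, the even band sizes, and the choice of pairs (0, 1) and (2, 3) are assumptions; the README does not say how the driver numbers or pairs the GPUs.

```python
def afr_gpu(frame_index, num_gpus=4):
    """AFR: cycle through all GPUs, one frame each."""
    return frame_index % num_gpus


def sfr_bands(height, num_gpus=4):
    """SFR: split the frame into num_gpus horizontal bands of rows.

    Returns (first_row, end_row) per GPU, end exclusive. An even split is
    assumed here; it ignores any load balancing.
    """
    bounds = [height * i // num_gpus for i in range(num_gpus + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(num_gpus)]


def afr_of_aa_pair(frame_index):
    """AFR of AA: pairs of GPUs alternate frames; each GPU in the pair
    does half of the antialiasing work for that frame."""
    pair = frame_index % 2
    return (2 * pair, 2 * pair + 1)


if __name__ == "__main__":
    print(sfr_bands(1080))
    print([afr_of_aa_pair(f) for f in range(4)])
```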

With some GPU configurations, there is in addition a special SLI Mosaic Mode
to extend a single X screen transparently across all of the available display
outputs on each GPU. See below for the exact set of configurations which can
be used with SLI Mosaic Mode.

[Link: http://developer.download.nvidia.com/compute/cuda/3_2/drivers/docs/README_Linux.txt]


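I once heard (probably on an English-language forum?) that for BOINC SETI@HOME it is more
effective to run two GPUs independently than to run them in SLI.
It came back to me while reading the CUDA documentation, so I searched a bit. My guess is that, because of the way memory allocation works (I still need to look into this part),
an allocation also eats into the memory of the other GPUs, the system runs short of memory, and fewer CUDA devices than expected
actually end up running, so SLI turns out to be less useful than expected.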

4.3  Multiple Devices

In a system with multiple GPUs, all CUDA-enabled GPUs are accessible via the CUDA driver and runtime as separate devices. There are however special considerations as described below when the system is in SLI mode.

First, an allocation in one CUDA device on one GPU will consume memory on other GPUs. Because of this, allocations may fail earlier than otherwise expected.
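(Literal translation: First, a memory allocation for one CUDA device on one GPU will consume memory on the other GPUs. Because of this, memory allocations might fail sooner than expected.
Free translation: First, when memory is allocated, the CUDA device on one GPU also uses up memory on the other GPUs, so you may run out of memory sooner than you think.)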
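
To put a number on why allocations "may fail earlier than otherwise expected", here is a toy calculation. It assumes the worst case, where every allocation is fully mirrored on every GPU in the SLI group. That is only an assumption for illustration: the guide says allocations consume memory on other GPUs but does not say how much. The function name and parameters are hypothetical.

```python
def total_allocations(num_gpus, mem_per_gpu_mb, alloc_mb, mirrored):
    """Count how many alloc_mb-sized buffers fit across all CUDA devices.

    mirrored=False: each GPU only pays for its own allocations.
    mirrored=True:  assumed worst case for SLI, where every allocation
                    takes alloc_mb on every GPU in the group.
    """
    per_gpu = mem_per_gpu_mb // alloc_mb
    if mirrored:
        # All devices share one pool the size of a single GPU's memory.
        return per_gpu
    return per_gpu * num_gpus


if __name__ == "__main__":
    print(total_allocations(2, 1024, 256, mirrored=False))
    print(total_allocations(2, 1024, 256, mirrored=True))
```

Under this assumption, two 1024 MB GPUs hold eight 256 MB buffers when they run independently but only four in SLI, which fits the suspicion that SLI leaves less usable memory for CUDA work.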

Second, when a Direct3D application runs in SLI Alternate Frame Rendering mode, the Direct3D device(s) created by that application can be used for CUDA-Direct3D interoperability (i.e., passed as a parameter to cudaD3D[9|10]SetDirect3DDevice() when using the runtime API), but only one CUDA device can be created at a time from one of these Direct3D devices.

This CUDA device only executes the CUDA work on one of the GPUs in the SLI configuration.
As a consequence, real interoperability only happens with the copy of a Direct3D resource in that GPU
(note: in AFR mode Direct3D resources that must be in GPU memory are duplicated in the GPU memory of each GPU in the SLI configuration).
In some cases this is not the desired behavior and an application may need to forfeit use of the CUDA-Direct3D interoperability API and manually copy the output of its CUDA work to Direct3D resources using the existing CUDA and
Direct3D API.

[Source: excerpted from NVIDIA_CUDA_C_ProgrammingGuide.pdf]

As for the second point, I don't know what interoperability is about yet, so I'll skip it for now -_-

A search turned up a thread with nearly the same title and content -_-

Posted 31 Jan 2009 19:22:14 UTC
SLI basically combines 2 (or more) matched GPU devices into 1 logical GPU device. When in SLI mode, the system sees only 1 logical GPU and unfortunately for CUDA this means that it only has visibility to 1 physical device (not 2, 3 or 4). Disabling SLI mode for CUDA is best because it allows SETI to take advantage of each GPU as its own device.

[Link: http://boinc.berkeley.edu/dev/forum_thread.php?id=3592]

2010/10/09 - [Using Programs/BOINC - seti@home] - CUDA and SLI

