Hi Tao,
A good SYCL implementation should only move data when correctness requires it, so if you run a series of kernels on the same device using the same data, I wouldn’t expect you to see unnecessary copies under the hood.
I’d suggest running your code under a profiler and checking whether the SYCL runtime performs any unexpected copies. If you see any, you can start to think about manual data movement, but you should also file a performance bug with the vendor of the implementation you’re using.
Thanks for starting this thread!
Mike