IBM Data and AI Ideas Portal for Customers



Status Delivered
Workspace Spectrum LSF
Created by Guest
Created on Jun 8, 2016

systemd service unit files for LSF

Currently, LSF provides init.d scripts to start and stop LSF services on the host. This works well on SysV-based systems such as RHEL 6, but not on systemd-based systems such as RHEL 7.

systemd lets us control boot-time dependencies with finer granularity, so that, for example, LSF starts only once GPFS is up and has mounted all of its filesystems.

init.d scripts don't let us declare systemd dependencies, so there is no controlled way to determine when LSF starts.

Could you please provide systemd service unit files for LSF?

  • Guest
    Jun 8, 2017

    The systemd unit file as specified (and provided in LSF 10.1 fix 432732) is not compatible with running MPI jobs over InfiniBand: it gives a memlock limit of 64, rather than the much larger (or, typically, unlimited) value required to register memory for RDMA buffers.

    Here's the text of my support request reporting the problem, including workaround:

    Problem Details

    Product or Service: Spectrum LSF Standard Edition A.1.0
    Component ID: 5725G8201
    Operating System: Linux

    Problem title
    hostsetup on systemd-controlled hosts disallows MPI over Infiniband (low memlock)

    Problem description and business impact

    Description:

    When using hostsetup on SLES12.1 (and presumably other systemd-enabled
    distros), the following lsfd.service file is created:
    ---
    [Unit]
    Description=IBM Spectrum LSF
    After=network.target nfs.service autofs.service gpfs.service

    [Service]
    Type=forking
    ExecStart=/10.1/linux3.10-glibc2.17-x86_64/etc/lsf_daemons start
    ExecStop=/10.1/linux3.10-glibc2.17-x86_64/etc/lsf_daemons stop
    KillMode=none

    [Install]
    WantedBy=multi-user.target
    ---

    When LSF is started using 'systemctl start lsfd', the ludicrously low
    default memlock limit of 64 is used, and sbatchd inherits that value.
    All jobs started on the host then inherit it as their limit.
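
One way to confirm which limit a running process actually inherited is to read its entry under /proc. The following is a minimal sketch, assuming a Linux host with procfs mounted; the helper name `memlock_limit` is hypothetical and not part of LSF. Note that /proc reports the value in bytes, so a 64 KB limit would appear as 65536.

```shell
#!/bin/sh
# Hypothetical helper: print the soft limit, hard limit, and unit of
# "Max locked memory" for a given PID (default: the current shell).
memlock_limit() {
    pid="${1:-$$}"
    awk '/^Max locked memory/ { print $4, $5, $6 }' "/proc/${pid}/limits"
}

# Example: inspect the current shell. On an LSF host one might instead pass
# the PID of sbatchd, e.g. memlock_limit "$(pgrep -o sbatchd)"
# (assuming the daemon is running under that process name).
memlock_limit
```

The same check run from inside a job script would show what the job itself inherited.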

    Problem:

    When starting a multi-node job on these nodes that uses MPI over
    InfiniBand (Ansys Fluent in this case), the following error is observed:

    1494948637 | fluent_mpi.17.0.0: Rank 0:1: MPI_Init: ibv_create_cq() failed 4
    1494948637 | fluent_mpi.17.0.0: Rank 0:1: MPI_Init: Can't initialize RDMA device
    1494948637 | fluent_mpi.17.0.0: Rank 0:1: MPI_Init: Internal Error: Cannot initialize RDMA protocol

    The RDMA protocol cannot be initialised because it requires that memory
    be pinned for use as RDMA buffers. In general it is advised to allow
    the maximum value for locked memory to be unlimited and to allow the
    MPI implementation to ensure that a sensible amount is actually
    registered.

    The Workaround:

    Modify hostsetup to insert the line 'LimitMEMLOCK=infinity', as below:

    cat > $_tmp_service_file << EOF
    [Unit]
    Description=IBM Spectrum LSF
    After=network.target nfs.service autofs.service gpfs.service

    [Service]
    LimitMEMLOCK=infinity
    Type=forking
    ExecStart=${LSF_SERVERDIR}/lsf_daemons start
    ExecStop=${LSF_SERVERDIR}/lsf_daemons stop
    KillMode=none

    [Install]
    WantedBy=multi-user.target

    EOF

    The Fix:

    Offer an option to insert the line, or not, based on usage requirements.
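
As an alternative to editing hostsetup, the same setting could be applied per host with a systemd drop-in, which leaves the generated lsfd.service untouched. The following is a sketch only: the function name, the file name memlock.conf, and the second argument are illustrative assumptions, and it has not been checked against any particular LSF release.

```shell
#!/bin/sh
# Sketch: write a systemd drop-in that raises the memlock limit for a unit.
# The function name, file name, and arguments are illustrative assumptions.
write_memlock_dropin() {
    unit="$1"                          # e.g. lsfd.service
    base="${2:-/etc/systemd/system}"   # drop-in root; override for testing
    dir="${base}/${unit}.d"
    mkdir -p "$dir" || return 1
    printf '[Service]\nLimitMEMLOCK=infinity\n' > "${dir}/memlock.conf"
}

# Usage (as root), followed by a reload and restart so the limit takes effect:
#   write_memlock_dropin lsfd.service
#   systemctl daemon-reload
#   systemctl restart lsfd
```

After the reload, 'systemctl show lsfd -p LimitMEMLOCK' should report the value systemd will apply. Hosts that do not run RDMA jobs can simply omit the drop-in, which gives the "or not" half of the option.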


    The Impact:

    For me, none, as I have fixed my hostsetup. For anyone who hasn't
    figured it out, failure of all multi-node jobs that use RDMA.


    Apologies if that has already been fixed in a subsequent patch.

  • Guest
    Dec 21, 2016

    This enhancement is available for LSF 9.1.3. Please download it from Fix Central:
    http://www-933.ibm.com/support/fixcentral/swg/selectFixes?parent=Platform%2BComputing&product=ibm/Other+software/Platform+LSF&release=All&platform=All&function=fixId&fixids=lsf-9.1.3-build431869&includeSupersedes=0

    Fix ID: lsf-9.1.3-build431869

  • Guest
    Dec 7, 2016

    Changing this to the correct status in the all-customer view.

  • Guest
    Nov 29, 2016

    test

  • Guest
    Aug 19, 2016

    We will address this in a future fix pack.

  • Guest
    Jun 20, 2016

    Creating a new RFE based on Community RFE #89685 in product Platform LSF.